pairing/unpairing operations.

An embedded higher order combinator language provides any-to-any
encodings automatically.

Besides applications to experimental mathematics, a few examples of

"free algorithms" obtained by transferring operations between data types 
are shown. Other applications range from stream iterators on combina- 
torial objects to self-delimiting codes, succinct data representations and 
generation of random instances. 

The paper covers 59 data types and, through the use of the embedded
combinator language, provides 3540 distinct bijective transformations
between them.

Haskell program, is available at http://logic.csci.unt.edu/tarau/research/2008/fISO.zip

A short, 5 page version of the paper, published as [1], describes the idea
of organizing various data transformations as encodings to sequences of

encodings to related hereditarily finite universes.

1 Introduction

Analogical/metaphorical thinking routinely shifts entities and operations from 
a field to another hoping to uncover similarities in representation or use [2]. 



Compilers convert programs from human centered to machine centered
representations - sometimes reversibly.

Complexity classes are defined through compilation with limited resources
(time or space) to similar problems [3,4].

Mathematical theories often borrow proof patterns and reasoning techniques
across close and sometimes not so close fields.

A relatively small number of universal data types are used as basic building 
blocks in programming languages and their runtime interpreters, correspond- 
ing to a few well tested mathematical abstractions like sets, functions, graphs, 
groups, categories etc. 

A less obvious leap is that if heterogeneous objects can be seen in some way 
as isomorphic, then we can share them and compress the underlying informa- 
tional universe by collapsing isomorphic encodings of data or programs whenever 
possible. 

Sharing heterogeneous data objects faces two problems: 

- some form of equivalence needs to be proven between two objects A and
B before A can replace B in a data structure, a possibly tedious and error
prone task

- the fast growing diversity of data types makes it harder and harder to
recognize sharing opportunities.

Besides, this raises the question: what guarantees do we have that sharing
across heterogeneous data types is useful and safe?

The techniques introduced in this paper provide a generic solution to these 
problems, through isomorphic mappings between heterogeneous data types, such 
that unified internal representations make equivalence checking and sharing pos- 
sible. The added benefit of these "shapeshifting" data types is that the functors 
transporting their data content will also transport their operations, resulting in 
shortcuts that provide, for free, implementations of interesting algorithms. The 
simplest instance is the case of isomorphisms - reversible mappings that also 
transport operations. In their simplest form such isomorphisms show up as en- 
codings to some simpler and easier to manipulate representation, for instance 
natural numbers. 

Such encodings can be traced back to Gödel numberings [5,6] associated to
formulae, but a wide diversity of common computer operations, ranging from
data compression and serialization to wireless data transmissions and
cryptographic codes, qualify.

Encodings between data types provide a variety of services ranging from free 
iterators and random objects to data compression and succinct representations. 
Tasks like serialization and persistence are facilitated by simplification of reading 
or writing operations without the need of special purpose parsers. Sensitivity 
to internal data representation format or size limitations can be circumvented 
without extra programming effort. 



2 An Embedded Data Transformation Language 



We will start by designing an embedded transformation language as a set of
operations on a groupoid of isomorphisms. We will then extend it with a set
of higher order combinators mediating the composition of the encodings and the
transfer of operations between data types.

2.1 The Groupoid of Isomorphisms 

We implement an isomorphism between two objects X and Y as a Haskell data
type encapsulating a bijection f and its inverse g = f^-1. We will call the from
function the first component (a section in category theory parlance) and the to
function the second component (a retraction) defining the isomorphism. We can
organize isomorphisms as a groupoid as follows:

data Iso a b = Iso (a->b) (b->a)

from (Iso f _) = f
to (Iso _ g) = g

compose :: Iso a b -> Iso b c -> Iso a c
compose (Iso f g) (Iso f' g') = Iso (f' . f) (g . g')
itself = Iso id id
invert (Iso f g) = Iso g f

Assuming that for any pair f, g of type Iso a b, f o g = id_a and g o f = id_b,
we can now formulate laws about isomorphisms that can be used to test
correctness of implementations with tools like QuickCheck [7].

Proposition 1 The data type Iso has a groupoid structure, i.e. the compose
operation, when defined, is associative, itself acts as an identity element and
invert computes the inverse of an isomorphism.

We can transport operations from one object to another with borrow and lend
combinators defined as follows:

borrow :: Iso t s -> (t -> t) -> s -> s
borrow (Iso f g) h x = f (h (g x))
borrow2 (Iso f g) h x y = f (h (g x) (g y))
borrowN (Iso f g) h xs = f (h (map g xs))

lend :: Iso s t -> (t -> t) -> s -> s
lend = borrow . invert
lend2 = borrow2 . invert
lendN = borrowN . invert



The combinators fit and retrofit just transport an object x through an
isomorphism and apply to it an operation op available on the other side:

fit :: (b -> c) -> Iso a b -> a -> c
fit op iso x = op ((from iso) x)

retrofit :: (a -> c) -> Iso a b -> b -> c
retrofit op iso x = op ((to iso) x)

We can see the combinators from, to, compose, itself, invert, borrow,
lend, fit etc. as part of an embedded data transformation language. Note that
in this design we borrow from our strongly typed host programming language its 
abstraction layers and safety mechanisms that continue to check the semantic 
validity of the embedded language constructs. 
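The groupoid laws of Proposition 1 can be spot-checked without any testing library. The following is a minimal sketch, not the paper's test suite: the example isomorphisms `shift` and `negated`, the helper `agreeOn` and the sample range are all assumptions introduced here for illustration.

```haskell
-- Sketch of the Iso groupoid with pointwise spot checks of its laws.
data Iso a b = Iso (a -> b) (b -> a)

from :: Iso a b -> a -> b
from (Iso f _) = f

to :: Iso a b -> b -> a
to (Iso _ g) = g

compose :: Iso a b -> Iso b c -> Iso a c
compose (Iso f g) (Iso f' g') = Iso (f' . f) (g . g')

itself :: Iso a a
itself = Iso id id

invert :: Iso a b -> Iso b a
invert (Iso f g) = Iso g f

borrow :: Iso t s -> (t -> t) -> s -> s
borrow (Iso f g) h x = f (h (g x))

-- Hypothetical example isomorphisms, used only for illustration.
shift :: Iso Integer Integer
shift = Iso (+ 1) (subtract 1)

negated :: Iso Integer Integer
negated = Iso negate negate

-- Two isomorphisms agree on a sample if both directions agree pointwise.
agreeOn :: (Eq a, Eq b) => [a] -> [b] -> Iso a b -> Iso a b -> Bool
agreeOn xs ys i j =
  all (\x -> from i x == from j x) xs && all (\y -> to i y == to j y) ys

sample :: [Integer]
sample = [-5 .. 5]

identityLaw, inverseLaw, assocLaw :: Bool
identityLaw =
  agreeOn sample sample (compose itself shift) shift
    && agreeOn sample sample (compose shift itself) shift
inverseLaw = agreeOn sample sample (compose shift (invert shift)) itself
assocLaw =
  agreeOn sample sample
    (compose (compose shift negated) shift)
    (compose shift (compose negated shift))
```

A property-based tool such as QuickCheck would replace the fixed `sample` with generated inputs; the pointwise comparison stays the same.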

2.2 Choosing a Root 

To avoid defining n(n-1)/2 isomorphisms between n objects, we choose a Root
object to/from which we will actually implement isomorphisms. We will extend 
our embedded combinator language using the groupoid structure of the isomor- 
phisms to connect any two objects through isomorphisms to/from the Root. 

Choosing a Root object is somewhat arbitrary, but it makes sense to pick 
a representation that is relatively easily convertible to various others, efficiently
implementable and, last but not least, scalable to accommodate large objects up 
to the runtime system's actual memory limits. 

We will choose as our Root object finite sequences of natural numbers. They 
can be seen as finite functions from an initial segment of Nat, say [0..n], 
to Nat. This implies that a finite function can be seen as an array or a list of 
natural numbers except that we do not limit the size of the representation of its 
values. We will represent them as lists i.e. their Haskell type is [Nat]. 

type Nat = Integer
type Root = [Nat]

We can now define an Encoder as an isomorphism connecting an object to Root

type Encoder a = Iso a Root

together with the combinators with and as, providing an embedded transformation
language for routing isomorphisms through two Encoders.

with :: Encoder a -> Encoder b -> Iso a b
with this that = compose this (invert that)

as :: Encoder a -> Encoder b -> b -> a
as that this thing = to (with that this) thing

The combinator with turns two Encoders into an arbitrary isomorphism, i.e. it
acts as a connection hub between their domains. The combinator as adds a
more convenient syntax such that converters between A and B can be designed
as:

a2b x = as B A x
b2a x = as A B x

[Diagram: the Encoders A and B are each connected to Root; a2b = as B A is routed through Root.]



A particularly useful combinator that transports binary operations from one
Encoder to another, borrow_from, can be defined as follows:

borrow_from :: Encoder a -> (a -> a -> a) -> Encoder b -> b -> b -> b
borrow_from other op this x y = borrow2 (with other this) op x y

Note that one can also use the more intuitive equivalent definition

borrow_from' other op this x y = z where
  x' = as other this x
  y' = as other this y
  z' = op x' y'
  z = as this other z'

given that the following equivalence always holds:

borrow_from = borrow_from' (1)

We will provide extensive use cases for these combinators as we populate our 
groupoid of isomorphisms. Given that [Nat] has been chosen as the root, we will 
define our finite function data type fun simply as the identity isomorphism on 
sequences in [Nat]. 

fun :: Encoder [Nat]
fun = itself
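The hub idea behind with and as can be tried out with just two Encoders. The sketch below restates the combinators of this section in self-contained form; the second Encoder `rev` (a list stored in reverse order) is a hypothetical example introduced here, not one of the paper's data types.

```haskell
-- Sketch of routing conversions through a shared Root with `with` and `as`.
data Iso a b = Iso (a -> b) (b -> a)

type Nat = Integer
type Root = [Nat]
type Encoder a = Iso a Root

to :: Iso a b -> b -> a
to (Iso _ g) = g

compose :: Iso a b -> Iso b c -> Iso a c
compose (Iso f g) (Iso f' g') = Iso (f' . f) (g . g')

invert :: Iso a b -> Iso b a
invert (Iso f g) = Iso g f

borrow2 :: Iso t s -> (t -> t -> t) -> s -> s -> s
borrow2 (Iso f g) h x y = f (h (g x) (g y))

with :: Encoder a -> Encoder b -> Iso a b
with this that = compose this (invert that)

-- `as target source x` converts x from the source type to the target type.
as :: Encoder a -> Encoder b -> b -> a
as that this thing = to (with that this) thing

borrow_from :: Encoder a -> (a -> a -> a) -> Encoder b -> b -> b -> b
borrow_from other op this x y = borrow2 (with other this) op x y

-- The "more intuitive" variant: convert, operate, convert back.
borrow_from' :: Encoder a -> (a -> a -> a) -> Encoder b -> b -> b -> b
borrow_from' other op this x y = as this other (op x' y')
  where
    x' = as other this x
    y' = as other this y

fun :: Encoder [Nat]
fun = Iso id id

-- Hypothetical second Encoder: a list stored in reverse order.
rev :: Encoder [Nat]
rev = Iso reverse reverse
```

With these definitions, `as rev fun [1,2,3]` evaluates to `[3,2,1]`, and both `borrow_from fun (++) rev [1,2] [3]` and its primed variant evaluate to `[3,1,2]`, which is an instance of equivalence (1).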



3 Extending the Groupoid of Isomorphisms 

We will now populate our groupoid of isomorphisms with combinators based on 
a few primitive converters. 

3.1 An Isomorphism between Finite Multisets and Finite Functions 

Multisets [8] are unordered collections with repeated elements. Non-decreasing 
sequences provide a canonical representation for multisets of natural numbers. 
The isomorphism between finite multisets and finite functions is specified with 
two bijections mset2fun and fun2mset. 



mset :: Encoder [Nat]
mset = Iso mset2fun fun2mset



While finite multisets and sequences representing finite functions share a
common representation [Nat], multisets are subject to the implicit constraint that
their order is immaterial^1. This suggests that a multiset like [4, 4, 1, 3, 3, 3] could
be represented by first ordering it as [1, 3, 3, 3, 4, 4] and then computing the
differences between consecutive elements, i.e.
$[x_0 \ldots x_i, x_{i+1} \ldots] \rightarrow [x_0 \ldots x_{i+1} - x_i \ldots]$.
This gives [1, 2, 0, 0, 1, 0], with the first element 1 followed by the increments
[2, 0, 0, 1, 0], as implemented by mset2fun:

mset2fun = to_diffs . sort . (map must_be_nat)
to_diffs xs = zipWith (-) (xs) (0:xs)
must_be_nat n | n>=0 = n

It can now be verified easily that incremental sums of the numbers in such a
sequence return the original multiset in sorted form, as implemented by fun2mset:

fun2mset ns = tail (scanl (+) 0 (map must_be_nat ns))

The resulting isomorphism mset can be applied directly using its two components 
mset2fun and fun2mset. Equivalently, it can be expressed more "generically" 
by using the as combinator, as follows: 

*ISO> mset2fun [1,3,3,3,4,4] 
[1,2,0,0,1,0] 

*ISO> fun2mset [1,2,0,0,1,0] 
[1,3,3,3,4,4] 

*ISO> as fun mset [1,3,3,3,4,4] 
[1,2,0,0,1,0] 

*ISO> as mset fun [1,2,0,0,1,0] 
[1,3,3,3,4,4] 
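The two directions of the multiset encoding can be exercised on their own, outside the combinator language. This is a small self-contained sketch of the definitions above; the camel-case helper names and the explicit error branch are choices made here, not the paper's.

```haskell
import Data.List (sort)

type Nat = Integer

-- Reject negative inputs, mirroring the guard in must_be_nat.
mustBeNat :: Nat -> Nat
mustBeNat n
  | n >= 0 = n
  | otherwise = error "mustBeNat: negative input"

-- Differences between consecutive elements, with an implicit leading 0.
toDiffs :: [Nat] -> [Nat]
toDiffs xs = zipWith (-) xs (0 : xs)

-- Multiset (any order) to finite function: sort, then take differences.
mset2fun :: [Nat] -> [Nat]
mset2fun = toDiffs . sort . map mustBeNat

-- Finite function to multiset in canonical (non-decreasing) form.
fun2mset :: [Nat] -> [Nat]
fun2mset ns = tail (scanl (+) 0 (map mustBeNat ns))
```

On canonical (sorted) multisets the two functions are mutual inverses; on an unsorted input such as `[4,4,1,3,3,3]` the round trip returns the sorted form `[1,3,3,3,4,4]`.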



3.2 An Isomorphism to Finite Sets of Natural Numbers 

While finite sets and sequences share a common representation [Nat], sets are
subject to the implicit constraints that all their elements are distinct and order
is immaterial. As in the case of multisets, this suggests that a set like {7, 1, 4, 3}
could be represented by first ordering it as {1, 3, 4, 7} and then computing the
differences between consecutive elements. This gives [1, 2, 1, 3], with the first
element 1 followed by the increments [2, 1, 3]. To turn it into a bijection, including
0 as a possible member of a sequence, another adjustment is needed: elements in
the sequence of increments should be replaced by their predecessors. This gives
[1, 1, 0, 2] as implemented by set2fun:

set2fun xs | is_set xs = shift_tail pred (mset2fun xs)

shift_tail _ [] = []
shift_tail f (x:xs) = x:(map f xs)

is_set ns = ns==nub ns

^1 Such constraints can be regarded as laws that we assume about a given data type,
when needed, restricting it to the appropriate domain of the underlying
mathematical concept.

It can now be verified easily that predecessors of the incremental sums of the 
successors of numbers in such a sequence, return the original set in sorted form, 
as implemented by fun2set: 

fun2set = (map pred) . fun2mset . (map succ)

The Encoder (an isomorphism with fun) can be specified with the two bijections
set2fun and fun2set.

set :: Encoder [Nat]
set = Iso set2fun fun2set

The Encoder (set) is now ready to interoperate with another Encoder:

*ISO> as fun set [0,2,3,4,9]
[0,1,0,0,4]

*ISO> as set fun [0,1,0,0,4]
[0,2,3,4,9]

*ISO> as mset set [0,2,3,4,9] 
[0,1,1,1,5] 

*ISO> as set mset [0,1,1,1,5] 
[0,2,3,4,9] 

As the example shows, the Encoder set connects arbitrary lists of natural num- 
bers representing finite functions to strictly increasing sequences of (distinct) 
natural numbers representing sets. Then, through the use of the combinator as, 
sets represented by set are connected to multisets represented by mset. This 
connection is (implicitly) routed through a connection to fun, as if 

*ISO> as mset fun [0,1,0,0,4] 
[0,1,1,1,5] 

were executed. 
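The routing of set to mset through fun can be written out directly as a function composition. The sketch below is self-contained and uses helper names chosen here (`shiftTail`, `isSet`, `set2mset`); it assumes its inputs are lists of non-negative integers.

```haskell
import Data.List (nub, sort)

type Nat = Integer

toDiffs :: [Nat] -> [Nat]
toDiffs xs = zipWith (-) xs (0 : xs)

-- Apply f to every element except the first.
shiftTail :: (a -> a) -> [a] -> [a]
shiftTail _ [] = []
shiftTail f (x : xs) = x : map f xs

isSet :: [Nat] -> Bool
isSet ns = ns == nub ns

-- Set to finite function: sorted differences, increments decremented by 1.
set2fun :: [Nat] -> [Nat]
set2fun xs
  | isSet xs = shiftTail pred (toDiffs (sort xs))
  | otherwise = error "set2fun: repeated element"

-- Finite function to set: predecessors of the running sums of successors.
fun2set :: [Nat] -> [Nat]
fun2set = map pred . tail . scanl (+) 0 . map succ

fun2mset :: [Nat] -> [Nat]
fun2mset = tail . scanl (+) 0

-- What `as mset set` amounts to: set -> fun -> mset.
set2mset :: [Nat] -> [Nat]
set2mset = fun2mset . set2fun
```

For instance, `set2mset [0,2,3,4,9]` passes through the intermediate sequence `[0,1,0,0,4]` before producing the multiset `[0,1,1,1,5]`.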

3.3 Folding Sets into Natural Numbers 

We can fold a set, represented as a list of distinct natural numbers into a sin- 
gle natural number, reversibly, by observing that it can be seen as the list of 
exponents of 2 in the number's base 2 representation. 

nat_set = Iso nat2set set2nat

nat2set n | n>=0 = nat2exps n 0 where
  nat2exps 0 _ = []
  nat2exps n x =
    if (even n) then xs else (x:xs) where
      xs=nat2exps (n `div` 2) (succ x)

set2nat ns | is_set ns = sum (map (2^) ns)



We will standardize this pair of operations as an Encoder for a natural number 
using our Root as a mediator: 

nat :: Encoder Nat
nat = compose nat_set set

Given that nat is an isomorphism with the Root fun, one can use directly its
from and to components:

*ISO> from nat 2008
[3,0,1,0,0,0,0]
*ISO> to nat it 
2008 

Moreover, the resulting Encoder (nat) is now ready to interoperate with any
Encoder, in a generic way:

*ISO> as fun nat 2008
[3,0,1,0,0,0,0] 
*ISO> as set nat 2008 
[3,4,6,7,8,9,10] 

*ISO> as nat set [3,4,6,7,8,9,10] 
2008 

*ISO> lend nat reverse 2008 
1135 

*ISO> lend nat_set reverse 2008 
2008 

*ISO> borrow nat_set succ [1,2,3] 
[0,1,2,3] 

*ISO> as set nat 42
[1,3,5] 

*ISO> fit length nat 42 
3 

*ISO> retrofit succ nat_set [1,3,5] 
43 

The reader might notice at this point that we have already made full circle
- as finite sets can be seen as instances of finite sequences. Injective functions
that are not surjections with wider and wider gaps can be generated using the
fact that one of the representations is information theoretically "denser" than
the other, for a given range:

*ISO> as set fun [0,1,2,3]
[0,2,5,9]

*ISO> as set fun $ as set fun [0,1,2,3]
[0,3,9,19]

*ISO> as set fun $ as set fun $ as set fun [0,1,2,3] 
[0,4,14,34] 

One can now define, for instance, a mapping from natural numbers to multi-sets 
simply as: 

nat2mset = as mset nat 
mset2nat = as nat mset 



but we will not explicitly need such definitions as the equivalent function is
clearly provided by the combinator as. One can now borrow operations between
set and nat as follows:

*ISO> borrow_from set union nat 42 2008
2042

*ISO> 42 .|. 2008 :: Nat
2042

*ISO> borrow_from set intersect nat 42 2008
8

*ISO> 42 .&. 2008 :: Nat
8

*ISO> borrow_from nat (*) set [1,2,3] [4,5]
[5,7,9]

*ISO> borrow_from nat (+) set [1,2,3] [3,4,5]
[1,2,6]

and notice that operations like union and intersection of sets map to boolean
operations on numbers as expected, while other operations are not necessarily
meaningful at first sight. We will show next a few cases where such
"shapeshiftings" of operations reveal more interesting analogies.
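The correspondence between set operations and bitwise operations can be checked directly on the nat2set/set2nat pair. The sketch below is self-contained; the helpers `viaSets` and `viaNats` are names introduced here as stand-ins for borrow_from specialised to this one isomorphism.

```haskell
import Data.Bits ((.&.), (.|.))
import Data.List (intersect, union)

type Nat = Integer

-- Positions of the 1 bits of n, in increasing order.
nat2set :: Nat -> [Nat]
nat2set n
  | n >= 0 = go n 0
  | otherwise = error "nat2set: negative input"
  where
    go 0 _ = []
    go m x
      | even m = rest
      | otherwise = x : rest
      where
        rest = go (m `div` 2) (x + 1)

-- Sum of the powers of 2 named by a list of distinct exponents.
set2nat :: [Nat] -> Nat
set2nat ns = sum (map (2 ^) ns)

-- Run a set operation on two numbers by way of their bit-position sets.
viaSets :: ([Nat] -> [Nat] -> [Nat]) -> Nat -> Nat -> Nat
viaSets op x y = set2nat (op (nat2set x) (nat2set y))

-- Run an arithmetic operation on two sets by way of their numeric codes.
viaNats :: (Nat -> Nat -> Nat) -> [Nat] -> [Nat] -> [Nat]
viaNats op xs ys = nat2set (op (set2nat xs) (set2nat ys))
```

Here `viaSets union 42 2008` and `42 .|. 2008` both give 2042, while `viaNats (*) [1,2,3] [4,5]` gives `[5,7,9]`, since 14 * 48 = 672 = 2^5 + 2^7 + 2^9.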



3.4 Encoding Finite Multisets with Primes

A factorization of a natural number is uniquely described as a multiset of primes.
We will use the fact that each prime number is uniquely associated to its position
in the infinite stream of primes to obtain a bijection from multisets of natural
numbers to natural numbers. We assume defined a prime generator primes and
a factoring function to_factors (see Appendix).

The function nat2pmset maps a natural number to the multiset of prime
positions in its factoring. Note that we treat 0 as [] and shift n to n+1 to
accommodate 0 and 1, to which prime factoring operations do not apply.

nat2pmset 0 = []
nat2pmset n = map (to_pos_in (h:ts)) (to_factors (n+1) h ts) where
  (h:ts)=genericTake (n+1) primes

to_pos_in xs x = fromIntegral i where
  Just i=elemIndex x xs

The function pmset2nat maps back a multiset of positions of primes to the
result of the product of the corresponding primes. Again, we map [] to 0 and
shift back by 1 the result.

pmset2nat [] = 0
pmset2nat ns = (product ks)-1 where
  ks=map (from_pos_in ps) ns
  ps=primes

from_pos_in xs n = xs !! (fromIntegral n)



We obtain the Encoder: 



pmset :: Encoder [Nat]
pmset = compose (Iso pmset2nat nat2pmset) nat

working as follows: 

*ISO> as pmset nat 2008 
[3,3,12] 

*ISO> as nat pmset it 
2008 

*ISO> map (as pmset nat) [0..7] 

[[] , [0] , [1] , [0,0] , [2] , [0,1] , [3] , [0,0,0]] 

Note that the mappings from a set or sequence to a number work in time and 
space linear in the bitsize of the number. On the other hand, as prime num- 
ber enumeration and factoring are involved in the mapping from numbers to 
multisets this encoding is intractable for all but small values. 
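Since the Appendix with primes and to_factors is not reproduced here, the sketch below substitutes a naive sieve and trial-division factoring. Both are assumptions made only to keep the example self-contained; they are not the paper's implementations and are suitable only for small inputs.

```haskell
type Nat = Integer

-- Assumed stand-in for the paper's prime generator: a naive lazy sieve.
primes :: [Nat]
primes = sieve [2 ..]
  where
    sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

-- Assumed stand-in for to_factors: prime factors in non-decreasing order.
factors :: Nat -> [Nat]
factors n = go n primes
  where
    go 1 _ = []
    go m ps@(p : rest)
      | p * p > m = [m]
      | m `mod` p == 0 = p : go (m `div` p) ps
      | otherwise = go m rest

-- Natural number to the multiset of positions of the primes dividing n+1.
nat2pmset :: Nat -> [Nat]
nat2pmset 0 = []
nat2pmset n = map posOf (factors (n + 1))
  where
    posOf p = fromIntegral (length (takeWhile (< p) primes))

-- Multiset of prime positions back to a natural number.
pmset2nat :: [Nat] -> Nat
pmset2nat [] = 0
pmset2nat ns = product (map (\k -> primes !! fromIntegral k) ns) - 1
```

As in the session above, `nat2pmset 2008` gives `[3,3,12]` because 2009 = 7 * 7 * 41, and 7 and 41 sit at positions 3 and 12 in the stream of primes.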

We are now ready to "shapeshift" between data types while watching for 
interesting landscapes to show up. 

3.5 Exploring the analogy between multiset decompositions and 
factoring 

As natural numbers can be uniquely represented as a multiset of prime factors
and, independently, they can also be represented as a multiset with the Encoder
mset described in subsection 3.1, the following question arises naturally:

Can the "easy to reverse" encoding mset in any way emulate or predict
properties of the difficult to reverse factoring operation?

The first step is to define an analog of the multiplication operation in terms of 
the computationally easy multiset encoding mset. Clearly, it makes sense to take 
inspiration from the fact that factoring of an ordinary product of two numbers 
can be computed by concatenating the multisets of prime factors of its operands. 

mprod = borrow_from mset (++) nat

Proposition 2 <N, mprod, 0> is a commutative monoid i.e. mprod is defined
for all pairs of natural numbers and it is associative, commutative and has 0 as
an identity element.

After rewriting the definition of mprod as the equivalent:

mprod_alt n m = as nat mset ((as mset nat n) ++ (as mset nat m))

the proposition follows immediately from the associativity of the concatenation
operation and the order independence of the multiset encoding provided by mset.

We can derive an exponentiation operation as a repeated application of
mprod:

mexp n 0 = 0
mexp n k = mprod n (mexp n (k-1))



Here are a few examples comparing mprod to ordinary multiplication and 
exponentiation: 

*ISO> mprod 41 (mprod 33 88)
3539

*ISO> mprod (mprod 41 33) 88
3539

*ISO> mprod 33 46
605

*ISO> mprod 46 33
605

*ISO> mprod 0 712
712

*ISO> mprod 5513 0
5513

*ISO> (41*33)*88
119064

*ISO> 41*(33*88)
119064

*ISO> 33*46
1518

*ISO> 46*33
1518

*ISO> 1*712
712

*ISO> 5513*1
5513

*ISO> map (\x->mexp x 2) [0..15]
[0,3,6,15,12,27,30,63,24,51,54,111,60,123,126,255]

*ISO> map (\x->x^2) [0..15]
[0,1,4,9,16,25,36,49,64,81,100,121,144,169,196,225]
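The mprod operation can be reproduced without the combinator language by composing the individual conversions by hand. This is a self-contained sketch: the direct helpers `nat2mset` and `mset2nat` are written out here as explicit compositions rather than through as, and they assume non-negative inputs.

```haskell
import Data.List (sort)

type Nat = Integer

-- Nat <-> set of exponents of 2 (increasing order).
nat2set :: Nat -> [Nat]
nat2set n = [i | (i, b) <- zip [0 ..] (bitsOf n), b == 1]
  where
    bitsOf 0 = []
    bitsOf m = m `mod` 2 : bitsOf (m `div` 2)

set2nat :: [Nat] -> Nat
set2nat = sum . map (2 ^)

toDiffs :: [Nat] -> [Nat]
toDiffs xs = zipWith (-) xs (0 : xs)

-- Nat -> set -> fun -> mset, composed by hand.
nat2mset :: Nat -> [Nat]
nat2mset = fun2mset . set2fun . nat2set
  where
    set2fun xs = case toDiffs xs of
      [] -> []
      (d : ds) -> d : map pred ds
    fun2mset = tail . scanl (+) 0

-- mset -> fun -> set -> Nat, composed by hand.
mset2nat :: [Nat] -> Nat
mset2nat = set2nat . fun2set . toDiffs . sort
  where
    fun2set = map pred . tail . scanl (+) 0 . map succ

-- Multiset-concatenation analog of multiplication, with 0 as identity.
mprod :: Nat -> Nat -> Nat
mprod n m = mset2nat (nat2mset n ++ nat2mset m)

mexp :: Nat -> Nat -> Nat
mexp _ 0 = 0
mexp n k = mprod n (mexp n (k - 1))
```

For example, 33 and 46 have the multisets `[0,4]` and `[1,1,1,2]`; their concatenation, once sorted, encodes back to 605, which is the value of `mprod 33 46` shown above.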

Note also that any multiset encoding of natural numbers can be used to define 
a similar commutative monoid structure. In the case of pmset we obtain: 

pmprod n m = as nat pmset ((as pmset nat n) ++ (as pmset nat m))

If one defines:

pmprod' n m = (n+1)*(m+1)-1

it follows immediately from the definition of pmprod that:

pmprod = pmprod' (2)

This is useful as computing pmprod' is easy while computing pmprod is intractable
for large values. This brings us back to observe that:

Proposition 3 <N, pmprod, 0> is a commutative monoid i.e. pmprod is
defined for all pairs of natural numbers and it is associative, commutative and has
0 as an identity element.
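Equivalence (2) can be spelled out in one short derivation. The notation $p_k$ for the prime at position $k$ (with $p_0 = 2$) is introduced here only for this sketch.

```latex
% Assumed notation: p_k is the k-th prime (p_0 = 2), and
% n + 1 = \prod_i p_{a_i}, m + 1 = \prod_j p_{b_j} are prime factorizations,
% so that [a_i] and [b_j] are the pmset encodings of n and m.
\begin{align*}
\mathit{pmprod}\; n\; m
  &= \Bigl(\prod_i p_{a_i} \cdot \prod_j p_{b_j}\Bigr) - 1 \\
  &= (n+1)(m+1) - 1 \\
  &= \mathit{pmprod'}\; n\; m
\end{align*}
% Example: pmprod 2 3 = (2+1)(3+1) - 1 = 11.
```

The identity element 0 corresponds to the empty multiset, whose (empty) product of primes is 1 = 0 + 1.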



Fig. 1 compares the shapes of pmprod' (virtually the same as ordinary
multiplication) and mprod for operands in [0..2^ - 1]. One can see the contrast
between the regular shape of ordinary multiplication and the recursively "self- 
similar" landscape induced by mprod. 

One can also bring mprod closer to ordinary multiplication by defining 

mprod' 0 _ = 0
mprod' _ 0 = 0
mprod' m n = (mprod (n-1) (m-1)) + 1

mexp' n 0 = 1
mexp' n k = mprod' n (mexp' n (k-1))

and by observing that they correlate as follows:

*ISO> map (\x->mexp' x 2) [0..16]
[0,1,4,7,16,13,28,31,64,25,52,55,112,61,124,127,256]

*ISO> map (\x->x^2) [0..16]
[0,1,4,9,16,25,36,49,64,81,100,121,144,169,196,225,256]

*ISO> map (\x->mexp' x 3) [0..16]
[0,1,8,15,64,29,120,127,512,57,232,239,960,253,1016,1023,4096]

*ISO> map (\x->x^3) [0..16]
[0,1,8,27,64,125,216,343,512,729,1000,1331,1728,2197,2744,3375,4096]

Fig. 2 shows that values for mexp' follow from below those of the x^2 function
and that equality only holds when x is a power of 2.

Note that the structure induced by mprod' is similar to ordinary multiplica- 
tion: 

Proposition 4 < N, mprod', 1 > is a commutative monoid i.e. mprod' is de- 
fined for all pairs of natural numbers and it is associative, commutative and has 
1 as an identity element. 

Interestingly, mprod' coincides with ordinary multiplication if one of the operands
is a power of 2. More precisely, the following holds:

Proposition 5 mprod' x y = x * y if and only if $\exists n \geq 0$ such that
$x = 2^n$ or $y = 2^n$. Otherwise, mprod' x y < x * y.

Fig. 3 shows the (scaled up by 1000) self-similar landscape generated by the
[0..1]-valued function (mprod' x y) / (x*y).

Besides the connection with products, natural mappings worth investigating 
are the analogies between multiset intersection and gcd of the corresponding 
numbers or between multiset union and the lcm of the corresponding numbers.
Assuming the definitions of multiset operations provided in the Appendix, one 
can define: 

mgcd :: Nat -> Nat -> Nat
mgcd = borrow_from mset msetInter nat

mlcm :: Nat -> Nat -> Nat
mlcm = borrow_from mset msetUnion nat

mdiv :: Nat -> Nat -> Nat
mdiv = borrow_from mset msetDif nat

and note that properties similar to usual arithmetic operations hold:

mprod (mgcd x y) (mlcm x y) = mprod x y (3)

mdiv (mprod x y) y = x (4)

mdiv (mprod x y) x = y (5)

Fig. 1: multiplication vs mprod: pmprod' and mprod

Fig. 2: Square vs. mexp' n 2

Fig. 3: Ratio between mprod' and product



While mprod, mprod', pmprod' and pmprod are not distributive with ordinary
addition, it looks like an interesting problem to find for each of them compatible 
additive operations. 
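Properties (3) to (5) can be checked exhaustively on a small range. The Appendix definitions of msetInter, msetUnion and msetDif are not part of this section, so the sketch below assumes list-difference based versions built on `(\\)` from Data.List; the conversions are again composed by hand.

```haskell
import Data.List (sort, (\\))

type Nat = Integer

nat2set :: Nat -> [Nat]
nat2set n = [i | (i, b) <- zip [0 ..] (bitsOf n), b == 1]
  where
    bitsOf 0 = []
    bitsOf m = m `mod` 2 : bitsOf (m `div` 2)

toDiffs :: [Nat] -> [Nat]
toDiffs xs = zipWith (-) xs (0 : xs)

nat2mset :: Nat -> [Nat]
nat2mset = tail . scanl (+) 0 . set2fun . nat2set
  where
    set2fun xs = case toDiffs xs of
      [] -> []
      (d : ds) -> d : map pred ds

mset2nat :: [Nat] -> Nat
mset2nat = sum . map (2 ^) . fun2set . toDiffs . sort
  where
    fun2set = map pred . tail . scanl (+) 0 . map succ

-- Assumed multiset operations (the paper's Appendix versions may differ).
msetDif, msetInter, msetUnion :: [Nat] -> [Nat] -> [Nat]
msetDif xs ys = xs \\ ys
msetInter xs ys = xs \\ (xs \\ ys)
msetUnion xs ys = xs ++ (ys \\ xs)

viaMset :: ([Nat] -> [Nat] -> [Nat]) -> Nat -> Nat -> Nat
viaMset op x y = mset2nat (op (nat2mset x) (nat2mset y))

mprod, mgcd, mlcm, mdiv :: Nat -> Nat -> Nat
mprod = viaMset (++)
mgcd = viaMset msetInter
mlcm = viaMset msetUnion
mdiv = viaMset msetDif

law3, law4, law5 :: Nat -> Nat -> Bool
law3 x y = mprod (mgcd x y) (mlcm x y) == mprod x y
law4 x y = mdiv (mprod x y) y == x
law5 x y = mdiv (mprod x y) x == y
```

Property (3) holds because, for each element, the minimum and maximum of its two multiplicities add up to their sum.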

3.6 Unfolding Natural Numbers into Bitstrings 

The isomorphism between natural numbers and bitstrings is well known, except
that it is usually ignored that conventional bit representations of integers need
a twist to be mapped one-to-one to arbitrary sequences of 0s and 1s. As the
usual binary representation always has 1 as its highest digit, nat2bits will drop
this bit, given that the length of the list of digits is (implicitly) known. This
transformation (a variant of the so called bijective base n representation), brings
us an isomorphism between Nat and the regular language {0, 1}*.

bits :: Encoder [Nat]
bits = compose (Iso bits2nat nat2bits) nat

nat2bits = drop_last . (to_base 2) . succ

drop_last bs =
  genericTake ((genericLength bs)-1) bs

to_base base n = d :
  (if q==0 then [] else (to_base base q)) where
    (q,d) = quotRem n base

bits2nat bs = pred (from_base 2 (bs ++ [1]))

from_base base [] = 0
from_base base (x:xs) | x>=0 && x<base =
  x+base*(from_base base xs)

Note also that, strictly speaking, this is only an isomorphism when the digits 

in the bitlist are in {0, 1}, therefore we shall assume this constraint as a law 
governing this Encoder. The following examples show two conversion operations 
and bits borrowing a multiplication operation from nat. 



*ISO> as bits nat 42 
[1,1,0,1,0] 

*ISO> as nat bits [1,1,0,1,0] 
42 

*ISO> borrow2 (with nat bits) (*) [1,1,0] [1,0,1,1] 
[1,0,0,1,1,0,0,0] 

The reader might notice at this point that we have made full circle again - 
as bitstrings can be seen as instances of finite sequences. Injective functions that 
are not surjections with wider and wider gaps can be generated by composing 
the as combinators: 

*ISO> as bits fun [1,1] 
[1,1,0] 

*ISO> as bits fun (as bits fun [1,1]) 
[1,1,0,1] 

*ISO> as bits fun $ as bits fun $ as bits fun [1,1] 
[1,1,0,1,1,0] 
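The "drop the highest bit of the successor" trick is easy to test in isolation. In this self-contained sketch `init` replaces drop_last and a `foldr` replaces the explicit recursion in from_base; both are stylistic choices made here, and the digit range guard of from_base is omitted.

```haskell
type Nat = Integer

-- Digits of n in the given base, least significant first (n >= 0).
toBase :: Nat -> Nat -> [Nat]
toBase base n = d : (if q == 0 then [] else toBase base q)
  where
    (q, d) = quotRem n base

fromBase :: Nat -> [Nat] -> Nat
fromBase base = foldr (\x acc -> x + base * acc) 0

-- Binary digits of n+1 with the (always 1) highest digit removed.
nat2bits :: Nat -> [Nat]
nat2bits = init . toBase 2 . succ

-- Restore the highest digit, decode, and undo the shift by one.
bits2nat :: [Nat] -> Nat
bits2nat bs = pred (fromBase 2 (bs ++ [1]))
```

Enumerating `map nat2bits [0..6]` gives `[[],[0],[1],[0,0],[1,0],[0,1],[1,1]]`, i.e. every bitstring appears exactly once, including the empty one and those with leading or trailing zeros.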



3.7 Encoding Signed Integers 

To encode signed integers one can map positive numbers to even numbers and 
strictly negative numbers to odd numbers. This gives the Encoder: 

type Z = Integer

z :: Encoder Z
z = compose (Iso z2nat nat2z) nat

nat2z n = if even n then n `div` 2 else (-n-1) `div` 2
z2nat n = if n<0 then -2*n-1 else 2*n

working as follows: 

*ISO> as set z (-42) 
[0,1,4,6] 

*ISO> as z set [0,1,4,6] 
-42 
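The even/odd interleaving can be checked on its own, independently of the nat Encoder. A minimal self-contained sketch of the two functions above:

```haskell
type Nat = Integer
type Z = Integer

-- Non-negative integers go to even codes, negative ones to odd codes.
z2nat :: Z -> Nat
z2nat n = if n < 0 then -2 * n - 1 else 2 * n

-- Inverse mapping: even codes are halved, odd codes become negative.
nat2z :: Nat -> Z
nat2z n = if even n then n `div` 2 else (-n - 1) `div` 2
```

For example, `map z2nat [-3..3]` gives `[5,3,1,0,2,4,6]`, and -42 is coded as 83, whose set of 1 bits is `[0,1,4,6]` as in the session above.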



3.8 Functional Binary Numbers 

Church numerals are well known as a functional representation for Peano
arithmetic. While benefiting from lazy evaluation, they implement a form of unary
arithmetic that uses O(n) space to represent n. This suggests devising a
functional representation that mimics binary numbers. We will do this following the
model described in subsection 3.6 to provide an isomorphism between Nat and
the functional equivalent of the regular language {0, 1}*. We will view each bit
as a Nat -> Nat transformer:

b x = pred x -- begin
o x = 2*x+0 -- bit 0
i x = 2*x+1 -- bit 1
e = 1 -- end

As the following example shows, composition of functions o and i closely parallels 
the corresponding bitlists: 

*ISO> b$i$o$o$i$i$o$i$i$i$i$e 
2008 

*ISO> as bits nat 2008 
[1,0,0,1,1,0,1,1,1,1]

We can follow the same model with an abstract data type: 

data D = E | O D | I D deriving (Eq,Ord,Show,Read)
data B = B D deriving (Eq,Ord,Show,Read)

from which we can generate functional bitstrings as an instance of a fold
operation:

funbits2nat :: B -> Nat
funbits2nat = bfold b o i e

bfold fb fo fi fe (B d) = fb (dfold d) where
  dfold E = fe
  dfold (O x) = fo (dfold x)
  dfold (I x) = fi (dfold x)

Dually, we can reverse the effect of the functions b, o, i, e as:

b' x = succ x
o' x | even x = x `div` 2
i' x | odd x = (x-1) `div` 2
e' = 1

and define a generator for our data type as an unfold operation:

nat2funbits :: Nat -> B
nat2funbits = bunfold b' o' i' e'

bunfold fb fo fi fe x = B (dunfold (fb x)) where
  dunfold n | n==fe = E
  dunfold n | even n = O (dunfold (fo n))
  dunfold n | odd n = I (dunfold (fi n))

The two operations form an isomorphism:

*ISO> funbits2nat (B$I$O$O$I$I$O$I$I$I$I$E)
2008

*ISO> nat2funbits it
B (I (O (O (I (I (O (I (I (I (I E))))))))))

We can define our Encoder as follows:

funbits :: Encoder B
funbits = compose (Iso funbits2nat nat2funbits) nat

Arithmetic operations can now be performed directly on this representation. 
For instance, one can define a successor function as: 

bsucc (B d) = B (dsucc d) where
  dsucc E = O E
  dsucc (O x) = I x
  dsucc (I x) = O (dsucc x)

Equivalently, arithmetic can be borrowed from Nat:

*ISO> bsucc (B$I$O$O$I$I$O$I$I$I$I$E)
B (O (I (O (I (I (O (I (I (I (I E))))))))))

*ISO> as nat funbits it
2009

*ISO> borrow (with nat funbits) succ (B$I$O$O$I$I$O$I$I$I$I$E)
B (O (I (O (I (I (O (I (I (I (I E))))))))))

*ISO> as nat funbits it
2009

While Haskell's C-based arbitrary length integers are likely to be more efficient 
for most operations, this representation, like Church numerals, has the benefit 
of supporting partial or delayed computations through lazy evaluation. 
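The fold/unfold pair and the direct successor can be condensed into a self-contained sketch. Here the generic bfold and bunfold are inlined into the two conversion functions, which is a simplification made for this example only.

```haskell
type Nat = Integer

data D = E | O D | I D deriving (Eq, Ord, Show, Read)
data B = B D deriving (Eq, Ord, Show, Read)

-- Fold: E is 1, O doubles, I doubles and adds 1; B takes the predecessor.
funbits2nat :: B -> Nat
funbits2nat (B d) = pred (dfold d)
  where
    dfold E = 1
    dfold (O x) = 2 * dfold x
    dfold (I x) = 2 * dfold x + 1

-- Unfold: start from the successor and peel off one bit at a time.
nat2funbits :: Nat -> B
nat2funbits n = B (dunfold (succ n))
  where
    dunfold 1 = E
    dunfold m
      | even m = O (dunfold (m `div` 2))
      | otherwise = I (dunfold ((m - 1) `div` 2))

-- Successor computed directly on the representation, with carry.
bsucc :: B -> B
bsucc (B d) = B (dsucc d)
  where
    dsucc E = O E
    dsucc (O x) = I x
    dsucc (I x) = O (dsucc x)
```

One way to gain confidence in bsucc is to compare it against `nat2funbits . succ . funbits2nat` on a range of inputs, which is what borrowing succ from Nat amounts to.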

4 Generic Unranking and Ranking Hylomorphisms 

The ranking problem for a family of combinatorial objects is finding a unique 
natural number associated to it, called its rank. The inverse unranking problem 
consists of generating a unique combinatorial object associated to each natural 
number. 

4.1 Pure Hereditarily Finite Data Types 

The unranking operation is seen here as an instance of a generic anamorphism 
mechanism (an unfold operation), while the ranking operation is seen as an
instance of the corresponding catamorphism (a fold operation) [9,10]. Together
they form a mixed transformation called hylomorphism. We will use such hylo- 
morphisms to lift isomorphisms between lists and natural numbers to isomor- 
phisms between a derived "self-similar" tree data type and natural numbers. In 
particular we will derive Ackermann's encoding from Hereditarily Finite Sets to 
Natural Numbers. 

The data type representing hereditarily finite structures will be a generic 
multi-way tree with a single leaf type [] . 

data T = H [T] deriving (Eq,Ord,Read,Show)

The two sides of our hylomorphism are parameterized by two transformations f
and g forming an isomorphism Iso f g:

unrank f n = H (unranks f (f n))
unranks f ns = map (unrank f) ns

rank g (H ts) = g (ranks g ts)
ranks g ts = map (rank g) ts

Both combinators can be seen as a form of "structured recursion" that propagates
a simpler operation guided by the structure of the data type. For instance, the
size of a tree of type T is obtained as:

tsize = rank (\xs->1 + (sum xs))

Note also that unrank and rank work on T in cooperation with unranks and
ranks working on [T].

We can now combine an anamorphism+catamorphism pair into an isomor- 
phism hylo defined with rank and unrank on the corresponding hereditarily 
finite data types: 

hylo :: Iso b [b] -> Iso T b
hylo (Iso f g) = Iso (rank g) (unrank f)

hylos :: Iso b [b] -> Iso [T] [b]
hylos (Iso f g) = Iso (ranks g) (unranks f)
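The rank/unrank pair can be tried out by plugging in the nat2set/set2nat bijection of subsection 3.3. The sketch below is self-contained; it passes the two functions directly instead of wrapping them in an Iso, which is a simplification made here.

```haskell
type Nat = Integer

data T = H [T] deriving (Eq, Ord, Read, Show)

-- Anamorphism: unfold a value into a tree by repeatedly applying f.
unrank :: (b -> [b]) -> b -> T
unrank f n = H (map (unrank f) (f n))

-- Catamorphism: fold a tree back into a value by applying g bottom-up.
rank :: ([b] -> b) -> T -> b
rank g (H ts) = g (map (rank g) ts)

-- Number of H nodes in a tree.
tsize :: T -> Nat
tsize = rank (\xs -> 1 + sum xs)

-- Positions of the 1 bits of n, in increasing order.
nat2set :: Nat -> [Nat]
nat2set n = [i | (i, b) <- zip [0 ..] (bitsOf n), b == 1]
  where
    bitsOf 0 = []
    bitsOf m = m `mod` 2 : bitsOf (m `div` 2)

set2nat :: [Nat] -> Nat
set2nat = sum . map (2 ^)
```

With these, `unrank nat2set 42` builds a tree with 12 nodes, and `rank set2nat` folds it back to 42. The recursion terminates because every member of `nat2set n` is strictly smaller than n.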

Hereditarily Finite Sets Hereditarily Finite Sets will be represented as an 
Encoder for the tree type T: 

hfs :: Encoder T
hfs = compose (hylo nat_set) nat

The hfs Encoder can now borrow operations from sets or natural numbers as
follows:

hfs_union = borrow2 (with set hfs) union
hfs_succ = borrow (with nat hfs) succ
hfs_pred = borrow (with nat hfs) pred

*ISO> hfs_succ (H [])
H [H []]

*ISO> hfs_union (H [H []]) (H [])
H [H []]

Otherwise, hylomorphism induced isomorphisms work as usual with our embed- 
ded transformation language: 

*ISO> as hfs nat 42
H [H [H []],H [H [],H [H []]],H [H [],H [H [H []]]]]

*ISO> as hfs nat 2008
H [H [H [],H [H []]],H [H [H [H []]]],H [H [H []],
H [H [H []]]],H [H [],H [H []],H [H [H []]]],
H [H [H [],H [H []]]],H [H [],H [H [],H [H []]]],
H [H [H []],H [H [],H [H []]]]]

One can notice that we have just derived as a "free algorithm" Ackermann's
encoding [11,12] from Hereditarily Finite Sets to Natural Numbers:

$f(x) = \text{if } x = \{\} \text{ then } 0 \text{ else } \sum_{a \in x} 2^{f(a)}$

together with its inverse: 

ackermann = as nat hfs
inverse_ackermann = as hfs nat

One can represent the action of a hylomorphism unfolding a natural number into
a hereditarily finite set as a directed graph with outgoing edges induced by
applying the inverse_ackermann function as shown in Fig. 4.
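As a worked instance of Ackermann's encoding, the tree obtained for 42 can be evaluated bottom-up. The derivation below only instantiates the formula for $f$; the set notation is the usual one for hereditarily finite sets.

```latex
\begin{align*}
f(\{\}) &= 0 \\
f(\{\{\}\}) &= 2^{0} = 1 \\
f(\{\{\{\}\}\}) &= 2^{1} = 2 \\
f(\{\{\},\{\{\}\}\}) &= 2^{0} + 2^{1} = 3 \\
f(\{\{\},\{\{\{\}\}\}\}) &= 2^{0} + 2^{2} = 5 \\
f(\{\,1,\,3,\,5\,\}) &= 2^{1} + 2^{3} + 2^{5} = 42
\end{align*}
% In the last line 1, 3 and 5 abbreviate the sets whose codes were
% computed in the preceding lines.
```

Reading the lines from bottom to top gives the unfolding performed by inverse_ackermann.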




Fig. 4: 2008 as a HFS 



Hereditarily Finite Functions The same tree data type can host a hylomor- 
phism derived from finite functions instead of finite sets: 

hf f : : Encoder T 

hff = compose (hylo nat) nat 

The hff Encoder can be seen as another "free algorithm", providing data com- 
pression/succinct representation for Hereditarily Finite Sets. Note, for instance, 
the significantly smaller tree size in: 



*ISO> as hff nat 42
H [H [H []],H [H []],H [H []]]
*ISO> as hff nat 2008
H [H [H [],H []],H [],H [H []],H [],H [],H [],H []]

As the cognoscenti might observe, this is explained by the fact that hff provides
higher information density than hfs, by incorporating order information
that matters in the case of a sequence and is ignored in the case of a set. One
can represent the action of a hylomorphism unfolding a natural number into a
hereditarily finite function as a directed ordered multi-graph, as shown in Fig. 5.
Note that as the mapping as fun nat generates a sequence where the order of
the edges matters, this order is indicated by integers, starting from 0, labeling
the edges.




Fig. 5: 2008 as a HFF 



It is also interesting to connect sequences and HFF directly - in case one 
wants to represent giant "sparse numbers" that correspond to sequences that 
would overflow memory if represented as natural numbers but have a relatively 
simple structure as formulae used to compute them. We obtain the Encoder: 

hffs :: Encoder T
hffs = Iso hff2fun fun2hff

fun2hff ns = H (map (as hff nat) ns)
hff2fun (H hs) = map (as nat hff) hs

which can be used to generate HFFs associated to very large numbers:

*ISO> as hffs fun [2^65,2^131]
H [H [H [H [],H [H [],H [H []]]]],H [H [H [],H [],H [H [],H [H []]]]]]



4.2 Hereditarily Finite Multisets 



In a similar way, one can derive an Encoder for Hereditarily Finite Multisets 
based on either the mset or the pmset isomorphisms: 

nat_mset = Iso nat2mset mset2nat

hfm :: Encoder T
hfm = compose (hylo nat_mset) nat

nat_pmset = Iso nat2pmset pmset2nat

hfpm :: Encoder T
hfpm = compose (hylo nat_pmset) nat

working as follows:

*ISO> as hfm nat 2008
H [H [H [],H []],H [H [],H []],H [H [H [H []]]],H [H [H [H []]]],
   H [H [H [H []]]],H [H [H [H []]]],H [H [H [H []]]]]
*ISO> as nat hfm it
2008
*ISO> as hfpm nat 2008
H [H [H [],H []],H [H [],H []],H [H [H [],H [H []]]]]
*ISO> as nat hfpm it
2008

After implementing this encoding, a Google search revealed that it is essentially
the same as [13], where it appears as an encoding of rooted trees.

4.3 A Hylomorphism with Atoms/Urelements 

A similar construction can be carried out for the more practical case when Atoms
(Urelements in Set Theory parlance) are present. Hereditarily Finite Sets with
Urelements are represented as generic multi-way trees with a leaf type holding
urelements/atoms:

data UT a = A a | F [UT a] deriving (Eq,Ord,Read,Show)

Atoms will be mapped to natural numbers in [0..ulimit-1]. Assuming for
simplicity that ulimit is fixed, we denote this set A and denote UT the set of
trees of type UT with atoms in A.

Unranking As an adaptation of the unfold operation, natural numbers will be
mapped to elements of UT with a generic higher order function unrankU f,
defined from Nat to UT, parameterized by the natural number ulimit and the
transformer function f:

ulimit = 4

unrankU = unrankUL ulimit
unranksU = unranksUL ulimit

unrankUL l _ n | n>=0 && n<l = A n
unrankUL l f n = F (unranksUL l f (f (n-l)))

unranksUL l f ns = map (unrankUL l f) ns

Ranking Similarly, as an adaptation of fold, a generic inverse mapping rankU is
defined as:

rankU = rankUL ulimit
ranksU = ranksUL ulimit

rankUL l _ (A n) | n>=0 && n<l = n
rankUL l g (F ts) = l+(g (ranksUL l g ts))

ranksUL l g ts = map (rankUL l g) ts

where rankU g maps trees to numbers and ranksU g maps lists of trees to lists
of numbers.

The following proposition describes conditions under which rankU and unrankU
can be used to lift isomorphisms between [Nat] and Nat to isomorphisms
involving hereditarily finite structures:

Proposition 6 If the transformer function $f : Nat \to [Nat]$ is a bijection
with inverse g, such that $n \geq ulimit \wedge f(n) = [n_0, \ldots, n_i, \ldots, n_k] \Rightarrow n_i < n$, then
(unrankU f) : Nat → UT is a bijection with inverse (rankU g) : UT → Nat
and the recursive computations defining both functions terminate in a finite
number of steps.

Proof. Note that unrankU terminates as its arguments strictly decrease at each 
step and rankU terminates as leaf nodes are eventually reached. That both are 
bijections, follows by induction on the structure of Nat and UT, given that map 
preserves bijections and that adding/subtracting ulimit ensures that encodings 
of atoms and sets never overlap. 

The resulting hylomorphisms are defined as previously: 

hyloU (Iso f g) = Iso (rankU g) (unrankU f) 
hylosU (Iso f g) = Iso (ranksU g) (unranksU f) 

An Encoder for Hereditarily Finite Sets with Urelements is defined as: 

uhfs :: Encoder (UT Nat)
uhfs = compose (hyloU nat_set) nat

Note that this encoder provides a generalization of Ackermann's mapping to
Hereditarily Finite Sets with Urelements in [0..u-1], defined as:

$f_u(x) = \text{if } x < u \text{ then } x \text{ else } u + \sum_{a \in x} 2^{f_u(a)}$

A similar Encoder for Hereditarily Finite Functions with Urelements is defined
as:

uhff :: Encoder (UT Nat)
uhff = compose (hyloU nat) nat
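The ranking/unranking pair with urelements can be tried out in isolation. The following is a minimal sketch, assuming that the set transformer maps a number to the positions of its 1 bits; bitsOf and fromBits are hypothetical stand-ins for nat2set and set2nat, and Integer stands in for Nat.

```haskell
-- Multi-way trees with atoms at the leaves, as in the UT data type.
data UT a = A a | F [UT a] deriving (Eq, Ord, Show)

-- Positions of the 1 bits of n (hypothetical stand-in for nat2set).
bitsOf :: Integer -> [Integer]
bitsOf = go 0
  where
    go _ 0 = []
    go i m
      | odd m     = i : go (i + 1) (m `div` 2)
      | otherwise = go (i + 1) (m `div` 2)

-- Inverse of bitsOf on lists of distinct positions (stand-in for set2nat).
fromBits :: [Integer] -> Integer
fromBits = sum . map (2 ^)

-- Numbers below the limit l are atoms; all others encode an inner node
-- obtained by unfolding n-l with the transformer f.
unrankUL :: Integer -> (Integer -> [Integer]) -> Integer -> UT Integer
unrankUL l f n
  | n >= 0 && n < l = A n
  | otherwise       = F (map (unrankUL l f) (f (n - l)))

-- Inverse mapping: atoms keep their value, inner nodes are shifted by l.
rankUL :: Integer -> ([Integer] -> Integer) -> UT Integer -> Integer
rankUL l _ (A n)
  | n >= 0 && n < l = n
  | otherwise       = error "rankUL: atom out of range"
rankUL l g (F ts) = l + g (map (rankUL l g) ts)
```

With a limit of 4, unrankUL 4 bitsOf 5 gives F [A 0], and rankUL 4 fromBits maps it back to 4 + 2^0 = 5, in line with the generalized Ackermann formula.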

4.4 Extending the encoding for the case of an infinite set of
Atoms/Urelements

An adaptation of the previous construction for the case when an infinite supply 
of atoms/urelements is needed (i.e. when their number is not known in advance) 
follows. 

Unranking As an adaptation of the unfold operation, natural numbers will be
mapped to elements of UT with a generic higher order function unrankIU f,
defined from Nat to UT, parameterized by the transformer function f:

unrankIU _ n | even n = A (n `div` 2)
unrankIU f n = F (unranksIU f (f ((n-1) `div` 2)))

unranksIU f ns = map (unrankIU f) ns

Note that (an infinite supply of) even numbers provides codes for atoms, while
odd numbers are used to encode the non-leaf structure of the trees in UT.

Ranking Similarly, as an adaptation of fold, a generic inverse mapping rankIU
g is defined as:

rankIU _ (A n) = 2*n
rankIU g (F ts) = 1+2*(g (ranksIU g ts))

ranksIU g ts = map (rankIU g) ts

where rankIU g maps trees to numbers and ranksIU g maps lists of trees to
lists of numbers.

The resulting hylomorphisms are defined as previously: 

hyloIU (Iso f g) = Iso (rankIU g) (unrankIU f)
hylosIU (Iso f g) = Iso (ranksIU g) (unranksIU f)

An Encoder for Hereditarily Finite Sets with an infinite supply of Urelements is
defined as:

iuhfs :: Encoder (UT Nat)
iuhfs = compose (hyloIU nat_set) nat

A similar Encoder for Hereditarily Finite Functions with an infinite supply
of Urelements is defined as:

iuhff :: Encoder (UT Nat)
iuhff = compose (hyloIU nat) nat



5 Permutations and Hereditarily Finite Permutations 



We have seen that finite sets and their derivatives represent information in an
order independent way, focusing exclusively on information content. We will
now look at data representations that focus exclusively on order in a content
independent way: finite permutations and their hereditarily finite derivatives.

To obtain an encoding for finite permutations we will first review a
ranking/unranking mechanism for permutations that involves an unconventional
numeric representation, factoradics.

5.1 The Factoradic Numeral System 

The factoradic numeral system [14] replaces digits multiplied by a power of a
base n with digits that multiply successive values of the factorial of n. In the
increasing order variant fr the first digit $d_0$ is 0, the second is $d_1 \in \{0, 1\}$ and
the n-th is $d_n \in [0..n]$. For instance, 42 = 0 * 0! + 0 * 1! + 0 * 2! + 3 * 3! + 1 * 4!.
The left-to-right, decreasing order variant fl is obtained by reversing the digits
of fr.

fr 42
[0,0,0,3,1]
rf [0,0,0,3,1]
42
fl 42
[1,3,0,0,0]
lf [1,3,0,0,0]
42

The function fr generating the factoradics of n, right to left, handles the special
case of 0 and calls a local function f which recurses and divides with increasing
values of n while collecting digits with mod:

fr 0 = [0]
fr n = f 1 n where
  f _ 0 = []
  f j k = (k `mod` j) :
          (f (j+1) (k `div` j))

The function fl, with digits left to right, is obtained as follows:

fl = reverse . fr

The function rf (inverse of fr) converts back to decimals by summing up results
while computing the factorial progressively:

rf ns = sum (zipWith (*) ns factorials) where
  factorials=scanl (*) 1 [1..]

Finally, lf, the inverse of fl, is obtained as:

lf = rf . reverse
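The factoradic conversions are small enough to be checked on their own. The sketch below restates them under hypothetical names (toFactoradic, fromFactoradic, digitsInRange), with Integer standing in for Nat, and adds a check of the digit bound stated above: digit i never exceeds i.

```haskell
-- Factoradic digits of n, least significant first (same digit order as fr).
toFactoradic :: Integer -> [Integer]
toFactoradic 0 = [0]
toFactoradic n = go 1 n
  where
    go _ 0 = []
    go j k = (k `mod` j) : go (j + 1) (k `div` j)

-- Weighted sum of the digits with 0!, 1!, 2!, ... (same role as rf).
fromFactoradic :: [Integer] -> Integer
fromFactoradic ds = sum (zipWith (*) ds factorials)
  where factorials = scanl (*) 1 [1 ..]

-- The digit at position i of a factoradic numeral is at most i.
digitsInRange :: [Integer] -> Bool
digitsInRange ds = and (zipWith (<=) ds [0 ..])
```

For example, toFactoradic 42 yields [0,0,0,3,1], which fromFactoradic sums back as 3 * 3! + 1 * 4! = 42.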



5.2 Ranking and unranking permutations of given size with Lehmer 
codes and factoradics 

The Lehmer code of a permutation f of size n is defined as the sequence
$l(f) = (l_1(f) \ldots l_i(f) \ldots l_n(f))$ where $l_i(f)$ is the number of elements of the set
$\{j > i \mid f(j) < f(i)\}$ [15].

Proposition 7 The Lehmer code of a permutation determines the permutation 
uniquely. 

The function perm2nth computes a rank for a permutation ps of size>0. It starts
by first computing its Lehmer code ls with perm2lehmer. Then it associates a
unique natural number n to ls, by converting it with the function lf from
factoradics to decimals. Note that the Lehmer code ls is used as the list of
digits in the factoradic representation.

perm2nth ps = (l,lf ls) where
  ls=perm2lehmer ps
  l=genericLength ls

perm2lehmer [] = []
perm2lehmer (i:is) = l : (perm2lehmer is) where
  l=genericLength [j|j<-is,j<i]

The function nth2perm provides the matching unranking operation, associating
a permutation ps to a given size>0 and a natural number n. It generates
the n-th permutation of a given size.

nth2perm (size,n) =
  apply_lehmer2perm (zs++xs) [0..size-1] where
    xs=fl n
    l=genericLength xs
    k=size-l
    zs=genericReplicate k 0

The following function extracts a permutation from a "digit" list in factoradic
representation.

apply_lehmer2perm [] [] = []
apply_lehmer2perm (n:ns) ps@(x:xs) =
  y : (apply_lehmer2perm ns ys) where
    (y,ys) = pick n ps

pick i xs = (x,ys++zs) where
  (ys,(x:zs)) = genericSplitAt i xs

Note also that apply_lehmer2perm is used this time to reconstruct the permutation
ps from its Lehmer code, which in turn is computed from the permutation's
factoradic representation.

One can try out this bijective mapping as follows: 



nth2perm (5,42) 

[1,4,0,2,3] 
perm2nth [1,4,0,2,3] 

(5,42) 
nth2perm (8,2008) 

[0,3,6,5,4,7,1,2] 
perm2nth [0,3,6,5,4,7,1,2] 

(8,2008) 
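The Lehmer code step can be isolated from the factoradic conversion. The following minimal sketch uses the hypothetical names lehmer and fromLehmer and plain Int lists for permutations of [0..n-1]; it only illustrates the counting and picking steps described above.

```haskell
-- Lehmer code: for each element, count the smaller elements to its right.
lehmer :: [Int] -> [Int]
lehmer []       = []
lehmer (x : xs) = length (filter (< x) xs) : lehmer xs

-- Rebuild the permutation: each code digit picks (and removes) the element
-- at that index from the pool of values not used so far.
fromLehmer :: [Int] -> [Int]
fromLehmer code = go code [0 .. length code - 1]
  where
    go [] _          = []
    go (c : cs) pool = y : go cs (before ++ after)
      where (before, y : after) = splitAt c pool
```

For the permutation [1,4,0,2,3] this gives the code [1,3,0,0,0], which is exactly the left-to-right factoradic form of 42 used by nth2perm (5,42).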



5.3 A bijective mapping from permutations to natural numbers 

As in the case of BDDs, one more step is needed to extend the mapping
between permutations of a given length to a bijective mapping from/to Nat:
we will have to "shift towards infinity" the starting point of each new block of
permutations in Nat as permutations of larger and larger sizes are enumerated.

First, we need to know by how much, so we compute the sum of all factorials
up to n!.

sf n = rf (genericReplicate n 1)

This is done by noticing that the factoradic representation of [0,1,1,..] does just
that.

What we are really interested in is decomposing n into the distance n-m to the
last sum of factorials m not exceeding n, and the index k of m in the sequence
of sums.

to_sf n = (k,n-m) where
  k=pred (head [x|x<-[0..],sf x>n])
  m=sf k
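The decomposition of n into a block index and an offset within that block can be checked numerically. This is a small sketch under hypothetical names (sumFact for sf, blockOf for to_sf), with Integer standing in for Nat; the offset it returns should always be a valid rank among the k! permutations of size k.

```haskell
-- sumFact n = 0! + 1! + ... + (n-1)!, the number of permutations of size < n.
sumFact :: Integer -> Integer
sumFact n = sum (take (fromInteger n) (scanl (*) 1 [1 ..]))

-- Split n into the permutation size k and the rank inside the block of size k.
blockOf :: Integer -> (Integer, Integer)
blockOf n = (k, n - sumFact k)
  where k = pred (head [x | x <- [0 ..], sumFact x > n])
```

For instance, blockOf 42 gives (5,8): 42 lands in the block of permutations of size 5, which starts at 34, with rank 8 inside that block.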

Unranking of an arbitrary permutation is now easy: the index k determines the
size of the permutation and n-m determines the rank. Together they select the
right permutation with nth2perm.

nat2perm 0 = []
nat2perm n = nth2perm (to_sf n)

Ranking of a permutation is even easier: we first compute its size and its rank,
then we shift the rank by the sum of all factorials up to its size, enumerating
the ranks previously assigned.

perm2nat ps = (sf l)+k where
  (l,k) = perm2nth ps

It works as follows:

nat2perm 42
[0,2,3,1,4]
perm2nat [0,2,3,1,4]
42
nat2perm 2008
[1,4,3,2,0,5,6]
perm2nat [1,4,3,2,0,5,6]
2008



We can now define the Encoder as:

perm :: Encoder [Nat]
perm = compose (Iso perm2nat nat2perm) nat

The Encoder works as follows:

*ISO> as perm nat 2008 
[1,4,3,2,0,5,6] 
*ISO> as nat perm it 
2008 

*ISO> as perm nat 1234567890 
[1,6,11,2,0,3,10,7,8,5,9,4,12] 
*ISO> as nat perm it 
1234567890 



5.4 Hereditarily Finite Permutations 

By using the generic unrank and rank functions defined in section 4 we
extend the isomorphism defined by nat2perm and perm2nat to encodings of
Hereditarily Finite Permutations (HFP).

nat2hfp = unrank nat2perm
hfp2nat = rank perm2nat

The encoding works as follows:

*ISO> nat2hfp 42
H [H [],H [H [],H [H []]],H [H [H []],H []],
   H [H []],H [H [],H [H []],H [H [],H [H []]]]]
*ISO> hfp2nat it
42

We can now define the Encoder as:

hfp :: Encoder T
hfp = compose (Iso hfp2nat nat2hfp) nat

The Encoder works as follows:

*ISO> as hfp nat 42
H [H [],H [H [],H [H []]],H [H [H []],H []],
   H [H []],H [H [],H [H []],H [H [],H [H []]]]]
*ISO> as nat hfp it
42
*ISO> as hfp nat 2008
H [H [H []],H [H [],H [H []],H [H [],H [H []]]],H [H [H []],H []],
   H [H [],H [H []]],H [],H [H [],H [H [],H [H []]],H [H []]],
   H [H [H []],H [],H [H [],H [H []]]]]
*ISO> as nat hfp it
2008



Fig. 6: 2008 as a HFP 



As shown in Fig. 6, an ordered digraph (with labels starting from 0 representing
the order of outgoing edges) can be used to represent the unfolding of a natural
number to the associated hereditarily finite permutation. An interesting
property of graphs associated to hereditarily finite permutations is that moving from
a number n to its successor typically only induces a reordering of the labeled
edges, as shown in Fig. 7.

6 Hereditary base-k representations and Goodstein
sequences

Definition 1 Hereditary base-k representation of a number x is obtained by
representing x as a sum of powers of k followed by expression of each of the exponents
with nonzero coefficients as a sum of powers of k, recursively.

First we express a single step of this transformation to/from a polynomial in
base k as a pair of bijections:

nat2kpoly k n = filter (\p->0/=fst p) ps where
  ns=to_base k n
  l=genericLength ns
  is=[0..l-1]
  ps=zip ns is



Fig. 7: 2009 as a HFP 



kpoly2nat k ps = sum (map (\(d,e)->d*k^e) ps)

The transformation works as follows:

*ISO> nat2kpoly 3 2009
[(2,0),(1,2),(2,3),(2,5),(2,6)]
*ISO> kpoly2nat 3 it
2009

The recursive process generates a tree, with coefficients of each expansion labeling
nodes. We can host this expansion in the data type HB:

data HB a = HB a [HB a] deriving (Eq,Ord,Show,Read)

We will define, for each base k, two isomorphisms nat2hb k and hb2nat k
between natural numbers and polynomials:

nat2hb :: Nat->Nat->[HB Nat]
nat2hb _ 0 = []
nat2hb k n | n<k = [HB n []]
nat2hb k n = gs where
  ps'=nat2kpoly k n
  gs=map (nat2hb1 k) ps'

nat2hb1 k (d,e) = HB d (nat2hb k e)

hb2nat :: Nat->[HB Nat]->Nat
hb2nat k [] = 0
hb2nat k ts = kpoly2nat k ps where
  ps=map (hb2nat1 k) ts

hb2nat1 k (HB d ts) = (d,hb2nat k ts)

We can now define a family of Encoders, one for each base k, as follows:

hb :: Nat->Encoder [HB Nat]
hb k = compose (Iso (hb2nat k) (nat2hb k)) nat

The new concept here is working with a parametric family of Encoders. With a 
small adaptation, the syntax of the as combinator scales up naturally: 

*ISO> as (hb 3) nat 42 

[HB 2 [HB 1 []],HB 1 [HB 2 []],HB 1 [HB 1 [HB 1 []]]] 

*ISO> as nat (hb 3) it 

42 

Note that the base does not occur as such in the hereditary base-k expression
obtained with the Encoder hb. This property can be used to obtain Goodstein
sequences by bumping the base from k to k+1, i.e. interpreting a (hb k)
expression as a (hb (k+1)) expression and then subtracting 1 from the result:

goodsteinStep k n = (hb2nat (k+1) (nat2hb k n)) - 1

goodsteinSeq _ 0 = []
goodsteinSeq k n = n : (goodsteinSeq (k+1) m) where
  m=goodsteinStep k n

goodstein m = goodsteinSeq 2 m

*ISO> goodstein 3
[3,3,3,2,1]
*ISO> take 12 (goodstein 4)
[4,26,41,60,83,109,139,173,211,253,299,348]

Goodstein's Theorem (provable in second order arithmetic) states that this
sequence always terminates at 0. The remarkable thing about it is that it is
an undecidable statement in first order Peano arithmetic that, in contrast to
Gödel's theorem, involves only "conventional" numerical relations.
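The base-bumping step can also be written without building the intermediate HB tree. The following sketch is an alternative formulation, not the paper's implementation: the hypothetical function bump rewrites a number from hereditary base k to hereditary base k+1 by recursing on digits and exponents directly (assuming k >= 2), and Integer stands in for Nat.

```haskell
-- Reinterpret the hereditary base-k representation of n in base k+1:
-- each digit d at exponent e contributes d * (k+1)^(bump k e).
bump :: Integer -> Integer -> Integer
bump k = go 0
  where
    go _ 0 = 0
    go e m = (m `mod` k) * (k + 1) ^ bump k e + go (e + 1) (m `div` k)

-- Goodstein sequence starting at n with initial base k; it stops before 0.
goodsteinFrom :: Integer -> Integer -> [Integer]
goodsteinFrom _ 0 = []
goodsteinFrom k n = n : goodsteinFrom (k + 1) (bump k n - 1)

goodsteinSketch :: Integer -> [Integer]
goodsteinSketch = goodsteinFrom 2
```

As a worked step, 4 = 2^2 with exponent 2 = 2^1, so bumping base 2 to base 3 gives 3^3 = 27, and subtracting 1 yields 26, the second element of the sequence for 4.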

7 Pairing/Unpairing 

A pairing function is an isomorphism $f : Nat \times Nat \to Nat$. Its inverse is called
unpairing.



7.1 The Pepis-Kalmar-Robinson Pairing Function 

A classic pairing function is pepis_J, together with its left and right unpairing
companions pepis_K and pepis_L, that have been used by Pepis, Kalmar and
Robinson together with Cantor's functions, in some fundamental work on recursion
theory, decidability and Hilbert's Tenth Problem in [16,17,18,19,20,21,22,23].
The function pepis_J combines two numbers reversibly by multiplying a power
of 2 derived from the first and an odd number derived from the second:

$f(x, y) = 2^x * (2 * y + 1) - 1$    (6)

Its Haskell implementation, together with its inverse, is:

pepis_J x y = pred ((2^x)*(succ (2*y)))

pepis_K n = two_s (succ n)

pepis_L n = (pred (no_two_s (succ n))) `div` 2

two_s n | even n = succ (two_s (n `div` 2))
two_s _ = 0

no_two_s n = n `div` (2^(two_s n))

This pairing function (slower in the second argument) works as follows:

pepis_J 1 10
41
pepis_J 10 1
3071
[pepis_J i j|i<-[0..3],j<-[0..3]]
[0,2,4,6,1,5,9,13,3,11,19,27,7,23,39,55]
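That the pairing and its two unpairing components are mutually inverse can be checked with a compact restatement. The names pepisPair and pepisUnpair below are hypothetical; the sketch computes both components in one pass by stripping factors of 2 from n+1, and uses Integer for Nat.

```haskell
-- f(x, y) = 2^x * (2*y + 1) - 1
pepisPair :: (Integer, Integer) -> Integer
pepisPair (x, y) = 2 ^ x * (2 * y + 1) - 1

-- Strip factors of 2 from n+1: their count is x, the odd remainder is 2*y+1.
pepisUnpair :: Integer -> (Integer, Integer)
pepisUnpair n = go 0 (n + 1)
  where
    go k m
      | even m    = go (k + 1) (m `div` 2)
      | otherwise = (k, (m - 1) `div` 2)
```

For example, pepisUnpair 41 factors 42 as 2^1 * 21, giving the pair (1,10).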

As Haskell provides a built-in ordered pair, it is convenient to regroup the
functions J, K, L (given in Julia Robinson's original notation) as mappings to/from
built-in ordered pairs:

pepis_pair (x,y) = pepis_J x y
pepis_unpair n = (pepis_K n,pepis_L n)

Observing that the number of 0s in front of the representation of a natural
number n as a sequence equals pepis_K n, an alternative implementation could
be:

pepis_pair' (x,y) = (fun2nat (x:(nat2fun y)))-1

pepis_unpair' n = (x,fun2nat ns) where
  (x:ns)=nat2fun (n+1)

fun2nat = set2nat . fun2set
nat2fun = set2fun . nat2set

Note also that pepis_unpair is "asymmetrical" in the sense that its first
component grows much slower than the second, when applied to [0..]. Sometimes
it is more useful to have the opposite behavior:

rpepis_pair (x,y) = pepis_pair (y,x)
rpepis_unpair n = (y,x) where (x,y)=pepis_unpair n

After defining

type Nat2 = (Nat,Nat)

we obtain the encoders

pnat2 :: Encoder Nat2
pnat2 = compose (Iso pepis_pair pepis_unpair) nat

rpnat2 :: Encoder Nat2
rpnat2 = compose (Iso rpepis_pair rpepis_unpair) nat



7.2 A Bitwise Pairing/Unpairing Function 

We will now introduce an unusually simple pairing function (also mentioned in 
ES], p. 142). 

The function bitunpair works by splitting a number's big endian bitstring
representation into odd and even bits, while its inverse bitpair blends the
odd and even bits back together.

bitpair :: Nat2 -> Nat
bitpair (i,j) =
  set2nat ((evens i) ++ (odds j)) where
    evens x = map (2*) (nat2set x)
    odds y = map succ (evens y)

bitunpair :: Nat->Nat2
bitunpair n = (f xs,f ys) where
  (xs,ys) = partition even (nat2set n)
  f = set2nat . (map (`div` 2))

The transformation of the bitlists is shown in the following example with
bitstrings aligned:

*ISO> bitunpair 2008
(60,26)

-- 2008: [0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1]
--   60: [0,    0,    1,    1,    1,    1]
--   26: [   0,    1,    0,    1,    1   ]

We can derive the following Encoder: 

nat2 :: Encoder Nat2
nat2 = compose (Iso bitpair bitunpair) nat



working as follows: 



*ISO> as nat2 nat 2008 
(60,26) 

*ISO> as nat nat2 (60,26) 
2008 
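The same bit interleaving can be expressed with shifts and masks instead of going through bit position sets. The sketch below is an alternative formulation using Data.Bits on non-negative Integers; spread, gather, pairBits and unpairBits are hypothetical names, and the code is meant only to illustrate the even/odd bit split.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))

-- Move bit i of n to bit 2*i, leaving the odd positions empty.
spread :: Integer -> Integer
spread 0 = 0
spread n = (n .&. 1) .|. (spread (n `shiftR` 1) `shiftL` 2)

-- Collect the bits at even positions of n into consecutive positions.
gather :: Integer -> Integer
gather 0 = 0
gather n = (n .&. 1) .|. (gather (n `shiftR` 2) `shiftL` 1)

-- First component goes to the even bits, second to the odd bits.
pairBits :: (Integer, Integer) -> Integer
pairBits (i, j) = spread i .|. (spread j `shiftL` 1)

unpairBits :: Integer -> (Integer, Integer)
unpairBits n = (gather n, gather (n `shiftR` 1))
```

On the running example, unpairBits 2008 splits the set bits 3,4,6,7,8,9,10 into even positions 4,6,8,10 (giving 60) and odd positions 3,7,9 (giving 26).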

In a way similar to hereditarily finite trees generated by unfoldings, one
can apply strictly decreasing unpairing functions recursively. Figures 8 and
9 show the directed graphs describing recursive application of pepis_unpair and
bitunpair, respectively.




Fig. 8: Graph obtained by recursive application of pepis_unpair for 2008 



Given that unpairing functions are bijections from Nat to Nat x Nat they
will progressively cover all points having natural number coordinates in their
range in the plane. Figures 10 and 11 show the curves generated by bitunpair and
pepis_unpair.

Fig. 12 shows the action of the pairing function bitpair on its two arguments
in [0..63].



^ except for 0 and 1, typically



Fig. 9: Graph obtained by recursive application of bitunpair for 2008




Fig. 12: Values of bitpair x y with x,y in [0..63] 



7.3 Encoding Unordered Pairs 



To derive an encoding of unordered pairs, i.e. 2 element sets, one can combine 
pairing/unpairing with conversion between sequences and sets: 

pair2unord_pair (x,y) = fun2set [x,y]
unord_pair2pair [a,b] = (x,y) where
  [x,y]=set2fun [a,b]

unord_unpair = pair2unord_pair . bitunpair
unord_pair = bitpair . unord_pair2pair

We can derive the following equivalent Encoders:

set2 :: Encoder [Nat]
set2 = compose (Iso unord_pair2pair pair2unord_pair) nat2

that goes through nat2, working as follows:

*ISO> as set2 nat 2008 
[60,87] 

*ISO> as nat set2 it 
2008 

and 

set2' :: Encoder [Nat]
set2' = compose (Iso unord_pair unord_unpair) nat

that goes through nat, working as follows: 

*ISO> as set2' nat 2008 
[60,87] 

*ISO> as nat set2' [60,87] 
2008 

*ISO> as nat set2' [87,60] 
2008 



7.4 Encoding Multiset Pairs

To derive an encoding of 2 element multisets, one can combine pairing/unpairing
with conversion between sequences and multisets:

pair2mset_pair (x,y) = (a,b) where [a,b]=fun2mset [x,y]
mset_unpair2pair (a,b) = (x,y) where [x,y]=mset2fun [a,b]

mset_unpair = pair2mset_pair . bitunpair
mset_pair = bitpair . mset_unpair2pair

We can derive the following Encoder:

mset2 :: Encoder Nat2
mset2 = compose (Iso mset_unpair2pair pair2mset_pair) nat2
working as follows: 



*ISO> as mset2 nat 2008 
(60,86) 

*ISO> as nat mset2 it 
2008 
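On two-element inputs, the sequence-to-set and sequence-to-multiset conversions reduce to simple arithmetic. The sketch below assumes, as the examples above suggest, that fun2set maps [x,y] to [x,x+y+1] and fun2mset maps [x,y] to [x,x+y]; the names toUnord, fromUnord, toMset and fromMset are hypothetical.

```haskell
-- Ordered pair -> two-element set: the second element is strictly larger.
toUnord :: (Integer, Integer) -> [Integer]
toUnord (x, y) = [x, x + y + 1]

-- Two distinct elements, in any order -> ordered pair.
fromUnord :: [Integer] -> (Integer, Integer)
fromUnord [a, b] = (lo, hi - lo - 1)
  where
    lo = min a b
    hi = max a b
fromUnord _ = error "fromUnord: expected exactly two elements"

-- Ordered pair -> two-element multiset: the second element may equal the first.
toMset :: (Integer, Integer) -> (Integer, Integer)
toMset (x, y) = (x, x + y)

-- Two elements, in any order, possibly equal -> ordered pair.
fromMset :: (Integer, Integer) -> (Integer, Integer)
fromMset (a, b) = (lo, hi - lo)
  where
    lo = min a b
    hi = max a b
```

Applied to the ordered pair (60,26) obtained by unpairing 2008, these give [60,87] and (60,86), matching the results shown for set2 and mset2.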



Figure 13 shows the curve generated by mset_unpair covering the lattice of
points in its range.




Fig. 13: 2D curve connecting values of mset_unpair n for n £ [0..2^" — 1] 



7.5 Extending Pairing/Unpairing to Signed Integers 

Given the bijection from nat to z one can easily extend pairing/unpairing oper- 
ations to signed integers. We obtain the Encoder: 

type Z2 = (Z,Z)

z2 :: Encoder Z2
z2 = compose (Iso zpair zunpair) nat

zpair (x,y) = (nat2z . bitpair) (z2nat x,z2nat y)
zunpair z = (nat2z n,nat2z m) where (n,m)= (bitunpair . z2nat) z



working as follows: 



*ISO> map zunpair [-5..5]
[(-1,1),(-2,-1),(-2,0),(-1,-1),(-1,0),(0,0),(0,-1),(1,0),(1,-1),(0,1),(0,-2)]
*ISO> map zpair it
[-5,-4,-3,-2,-1,0,1,2,3,4,5]



*ISO> as z2 z (-2008) 
(63,-26) 

*ISO> as z z2 it 
-2008 



Figure 14 shows the curve covering the lattice of integer coordinates generated
by the function zunpair.







Fig. 14: Curve generated by unpairing function on signed integers 



The same construction can be extended to multiset pairing functions:

mz2 :: Encoder Z2
mz2 = compose (Iso mzpair mzunpair) nat

mzpair (x,y) = (nat2z . mset_pair) (z2nat x,z2nat y)
mzunpair z = (nat2z n,nat2z m) where (n,m)= (mset_unpair . z2nat) z

working as follows:

*ISO> as mz2 z (-42)
(1,-8)
*ISO> as z mz2 it
-42



7.6 Gauss Integers and Pairing Functions 

Visualizing complex variable functions requires 4 dimensions even for 1-variable 
functions. This is usually handled by associating a color/hue value to the phase 
while representing the modulus along the z-axis. However, for 2-argument com- 
plex functions as simple as the sum, difference and the product 6 dimensions 
would be needed. Let us start shapeshifting operations on Gauss Integers (pairs 
of integers with a real and imaginary part) in combination with a mapping to 
ordinary integers using the (commutative!) multiset pairing/unpairing isomor- 
phism provided by the Encoder mz2: 

gauss_sum (ab,cd) = mzpair (a+c,b+d) where
  (a,b)=mzunpair ab
  (c,d)=mzunpair cd

gauss_dif (ab,cd) = mzpair (a-c,b-d) where
  (a,b)=mzunpair ab
  (c,d)=mzunpair cd

gauss_prod (ab,cd) = mzpair (a*c-b*d,b*c+a*d) where
  (a,b)=mzunpair ab
  (c,d)=mzunpair cd

Clearly one can now fit these operations in 3 dimensions, as shown in Figures 15,
16 and 17, visualizing sums, differences and products of Gauss Integers obtained by
unpairing integers in [—2^.. 2* — 1]. 

7.7 Some algebraic properties of pairing functions 

The following propositions state some simple algebraic identities between pairing 
operations acting on ordered, unordered and multiset pairs. 

Proposition 8 Given the function definitions: 

bitlift x = bitpair (x,0)
bitlift' = (from_base 4) . (to_base 2)

bitclip = fst . bitunpair
bitclip' = (from_base 2) . (map (`div` 2)) . (to_base 4) . (*2)

bitpair' (x,y) = (bitpair (x,0)) + (bitpair (0,y))
xbitpair (x,y) = (bitpair (x,0)) `xor` (bitpair (0,y))
obitpair (x,y) = (bitpair (x,0)) .|. (bitpair (0,y))

pair_product (x,y) = a+b where



Fig. 15: Sums of Gauss Integers visualized with Pairing functions

Fig. 16: Differences of Gauss Integers visualized with Pairing functions

Fig. 17: Products of Gauss Integers visualized with Pairing functions



  x'=bitpair (x,0)
  y'=bitpair (0,y)
  ab=x'*y'
  (a,b)=bitunpair ab

the following identities hold:

bitlift = bitlift'    (7)

bitclip = bitclip'    (8)

bitclip ∘ bitlift = id    (9)

bitpair(0, n) = 2 * bitpair(n, 0)    (10)

bitpair(0, n) = 2 * (bitlift n)    (11)

bitpair(n, n) = 3 * (bitlift n)    (12)

$bitpair(2^n, 0) = (2^n)^2$    (13)

$bitpair(2^{2^n} + 1, 0) = 2^{2^{n+1}} + 1$    (14)

bitpair' = bitpair = xbitpair = obitpair    (15)

bitpair(x, y) = (bitlift x) + 2 * (bitlift y)    (16)

pair_product = *    (17)



Proposition 9 Given the function definitions

bitpair'' (x,y) = mset_pair (min x y,x+y)
bitpair''' (x,y) = unord_pair [min x y,x+y+1]

mset_pair' (a,b) = bitpair (min a b, (max a b) - (min a b))
mset_pair'' (a,b) = unord_pair [min a b, (max a b)+1]

unord_pair' [a,b] = bitpair (min a b, (max a b) - (min a b) -1)
unord_pair'' [a,b] = mset_pair (min a b, (max a b)-1)

the following identities hold:

bitpair = bitpair'' = bitpair'''    (18)

mset_pair = mset_pair' = mset_pair''    (19)

unord_pair = unord_pair' = unord_pair''    (20)

8 Cons-Lists with Pairing/Unpairing 

The simplest application of pairing/unpairing operations is encoding of cons-lists 
of natural numbers, defined as the data type: 

data CList = Atom Nat | Cons CList CList
  deriving (Eq,Ord,Show,Read)

First, to provide an infinite supply of atoms, we encode them as even numbers:

to_atom n = 2*n
from_atom a | is_atom a = a `div` 2
is_atom n = even n && n>=0

Next, as we want atoms and cons cells disjoint, we will encode the latter as odd
numbers:

is_cons n = odd n && n>0
decons z | is_cons z = pepis_unpair ((z-1) `div` 2)
cons x y = 2*(pepis_pair (x,y))+1

We can deconstruct a natural number by recursing over applications of the
unpairing-based decons combinator:

nat2cons n | is_atom n = Atom (from_atom n)
nat2cons n | is_cons n =
  Cons (nat2cons hd)
       (nat2cons tl) where
    (hd,tl) = decons n



We can reverse this process by recursing with the cons combinator on the CList 
data type: 

cons2nat (Atom a) = to_atom a 

cons2nat (Cons h t) = cons (cons2nat h) (cons2nat t) 

The following example shows both transformations as inverses. 

*ISO> cons2nat (Cons (Atom 0) (Cons (Atom 1) (Cons (Atom 2) (Atom 3)))) 
26589 

*ISO> nat2cons 26589 

Cons (Atom 0) (Cons (Atom 1) (Cons (Atom 2) (Atom 3))) 
We obtain the Encoder:

clist :: Encoder CList
clist = compose (Iso cons2nat nat2cons) nat

The Encoder works as follows: 

*ISO> as clist nat 101 

Cons (Atom 0) (Cons (Atom 0) (Atom 3)) 

and can be used to generate random LISP-like data and code skeletons from 
natural numbers. 
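The cons-list encoding can be exercised end to end with a self-contained restatement. In the sketch below the hypothetical names consToNat, natToCons, pairP and unpairP are used, Integer stands in for Nat, and the Pepis pairing is inlined so that the block does not depend on the rest of the library.

```haskell
data CList = Atom Integer | Cons CList CList deriving (Eq, Show)

-- Pepis-Kalmar-Robinson pairing: 2^x * (2*y + 1) - 1.
pairP :: (Integer, Integer) -> Integer
pairP (x, y) = 2 ^ x * (2 * y + 1) - 1

unpairP :: Integer -> (Integer, Integer)
unpairP n = go 0 (n + 1)
  where
    go k m
      | even m    = go (k + 1) (m `div` 2)
      | otherwise = (k, (m - 1) `div` 2)

-- Atoms become even numbers, cons cells become odd numbers.
consToNat :: CList -> Integer
consToNat (Atom a)   = 2 * a
consToNat (Cons h t) = 2 * pairP (consToNat h, consToNat t) + 1

-- Both components of the unpaired value are smaller than n,
-- so the recursion terminates.
natToCons :: Integer -> CList
natToCons n
  | even n    = Atom (n `div` 2)
  | otherwise = Cons (natToCons h) (natToCons t)
  where (h, t) = unpairP ((n - 1) `div` 2)
```

As a worked step, 101 is odd, so it is a cons cell; unpairing (101-1)/2 = 50 gives (0,25), and continuing in the same way on 25 yields Cons (Atom 0) (Cons (Atom 0) (Atom 3)).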

9 Revisiting Multiset Encodings 

We will now use pairing/unpairing functions, in combination with mappings to 
sequences and sets to design an efficient encoding of multisets. 

The function fmset2nat starts by grouping the elements of a multiset. The
lengths of the groups (decremented by 1), as well as an element of each, are then
collected in 2 lists. Then the second list is morphed from a set to a sequence, as
this provides a more compact representation without changing the length of the
list. The first list, seen as a sequence, is then paired element by element with the
second list. Finally, the resulting numbers, seen as a sequence, are fused
together.

fmset2nat pairingf ms = m where
  mss=group (sort ms)
  xs=map (pred . genericLength) mss
  zs=map head mss
  ys=set2fun zs
  ps=zip xs ys
  ns=map pairingf ps
  m=fun2nat ns

The function fnat2mset reverses the process step by step:

fnat2mset unpairingf m = rs where
  ns=nat2fun m
  ps=map unpairingf ns
  (xs,ys)=unzip ps
  xs'=map succ xs
  zs=fun2set ys
  f k x = genericTake k (repeat x)
  rs = concat (zipWith f xs' zs)

After instantiating these generic functions to interesting pairing/unpairing func- 
tions 

bmset2nat = fmset2nat bitpair
nat2bmset = fnat2mset bitunpair

bmset2nat' = fmset2nat pepis_pair
nat2bmset' = fnat2mset pepis_unpair

we obtain the Encoders:

bmset :: Encoder [Nat]
bmset = compose (Iso bmset2nat nat2bmset) nat

bmset' :: Encoder [Nat]
bmset' = compose (Iso bmset2nat' nat2bmset') nat

working as follows:

*ISO> as bmset nat 2008
[1,1,2,3,3,4,5,6,7]
*ISO> as nat bmset it
2008

*ISO> map (as bmset nat) [0..7]
[[],[0],[0,0],[0,1],[1],[0,1,1],[0,0,1],[0,1,2]]
*ISO> as bmset' nat 2008
[0,0,0,1,2,2,3,4,5,6]
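The whole pipeline (group, pair counts with gaps, fuse) can be reproduced in a self-contained sketch. It assumes that nat2set lists the positions of 1 bits and that set2fun replaces an increasing list by its successive gaps, which is consistent with the examples shown; all names below (bitsOf, setToFun, msetToNat, and so on) are hypothetical stand-ins, and only the bitpair instance is covered.

```haskell
import Data.List (group, partition, sort)

bitsOf :: Integer -> [Integer]          -- positions of 1 bits, increasing
bitsOf = go 0
  where
    go _ 0 = []
    go i m
      | odd m     = i : go (i + 1) (m `div` 2)
      | otherwise = go (i + 1) (m `div` 2)

fromBits :: [Integer] -> Integer        -- inverse of bitsOf
fromBits = sum . map (2 ^)

setToFun :: [Integer] -> [Integer]      -- increasing set -> gaps
setToFun xs = zipWith (\x p -> x - p - 1) xs (-1 : xs)

funToSet :: [Integer] -> [Integer]      -- gaps -> increasing set
funToSet = tail . scanl (\acc x -> acc + x + 1) (-1)

natToFun :: Integer -> [Integer]
natToFun = setToFun . bitsOf

funToNat :: [Integer] -> Integer
funToNat = fromBits . funToSet

bitpair :: (Integer, Integer) -> Integer
bitpair (i, j) = fromBits (map (2 *) (bitsOf i) ++ map ((+ 1) . (2 *)) (bitsOf j))

bitunpair :: Integer -> (Integer, Integer)
bitunpair n = (half evens, half odds)
  where
    (evens, odds) = partition even (bitsOf n)
    half = fromBits . map (`div` 2)

-- Pair (multiplicity - 1) with the gap of each distinct element, then fuse.
msetToNat :: [Integer] -> Integer
msetToNat ms = funToNat (map bitpair (zip counts gaps))
  where
    groups = group (sort ms)
    counts = map (\g -> fromIntegral (length g) - 1) groups
    gaps   = setToFun (map head groups)

-- Reverse the steps: split, unpair, rebuild elements, replicate.
natToMset :: Integer -> [Integer]
natToMset m = concat (zipWith copies counts elems)
  where
    (cs, gaps) = unzip (map bitunpair (natToFun m))
    counts     = map (+ 1) cs
    elems      = funToSet gaps
    copies k x = replicate (fromInteger k) x
```

For 2008 the sequence is [3,0,1,0,0,0,0]; unpairing gives multiplicities [2,1,2,1,1,1,1] over the elements [1..7], that is, the multiset [1,1,2,3,3,4,5,6,7].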

Note that, in contrast to the intractable prime number based multiset encoding
pmset, this time we obtain an encoding that is linear in the bitsize of the natural
numbers involved, as in the case of mset. Note also that the construction is
generic in the sense that it works with any pairing/unpairing function. As in
the case of the mset and pmset multiset encodings, we can extend these encodings
to a hylomorphism hfbm:

nat_bmset = Iso nat2bmset bmset2nat

hfbm :: Encoder T
hfbm = compose (hylo nat_bmset) nat

nat_bmset' = Iso nat2bmset' bmset2nat'

hfbm' :: Encoder T
hfbm' = compose (hylo nat_bmset') nat

working as follows:

*ISO> as hfbm nat 42
H [H [],H [],H [H []],H [H []],H [H [],H []],H [H [],H []]]
*ISO> as nat hfbm it
42
*ISO> as hfbm' nat 2008
H [H [],H [],H [],H [H []],H [H [],H []],
   H [H [],H []],H [H [],H [H []]],H [H [H []]],
   H [H [],H [H []],H [H []]],H [H [],H [],H [H []]]]
*ISO> as nat hfbm' it
2008



10 Pairing Functions and Encodings of Binary Decision
Diagrams

As a variation on the theme of pairing/unpairing functions, we will show in this
section that a Binary Decision Diagram (BDD) representing the same logic function
as an n-variable $2^n$ bit truth table tt can be obtained by applying bitunpair
recursively to tt. More precisely, we will show that applying this unfolding
operation results in a complete binary tree of depth n representing a BDD that
returns tt when evaluated applying its boolean operations.

The binary tree type BT has the constants B0 and B1 as leaves representing
the boolean values 0 and 1. Internal nodes (that will represent if-then-else
decision points), will be marked with the constructor D. We will also add integers
to represent logic variables, ordered identically in each branch, as first arguments
of D. The two other arguments will be subtrees that represent THEN and ELSE
branches:

data BT a = B0 | B1 | D a (BT a) (BT a)
  deriving (Eq,Ord,Read,Show)

The constructor BDD wraps together a tree of type BT and the number of logic
variables occurring in it.

data BDD a = BDD a (BT a) deriving (Eq,Ord,Read,Show)
10.1 Unfolding natural numbers to binary trees 

The following functions apply bitunpair recursively to a natural number tt,
seen as an n-variable $2^n$-bit truth table, to build a complete binary tree of depth
n, which we will represent using the BDD data type.

unfold_bdd :: Nat2 -> BDD Nat
unfold_bdd (n,tt) = BDD n bt where
  bt=if tt<max then split_with bitunpair n tt
     else error
       ("unfold_bdd: last arg "++ (show tt)++
        " should be < " ++ (show max))
     where max = 2^2^n

split_with _ n 0 | n<1 = B0
split_with _ n 1 | n<1 = B1
split_with f n tt = D k (split_with f k tt1)
                        (split_with f k tt2) where
  k=pred n
  (tt1,tt2)=f tt

The following examples show results returned by unfold_bdd for the $2^{2^n}$ truth
tables associated to n variables, for n = 2:

BDD 2 (D 1 (D 0 B0 B0) (D 0 B0 B0))
BDD 2 (D 1 (D 0 B1 B0) (D 0 B0 B0))
BDD 2 (D 1 (D 0 B0 B0) (D 0 B1 B0))
...
BDD 2 (D 1 (D 0 B1 B1) (D 0 B1 B1))

Note that no boolean operations have been performed so far and that we still 
have to prove that such trees actually represent BDDs associated to truth tables. 

10.2 Folding binary trees to natural numbers 

One can "evaluate back" the binary tree of data type BDD by using the pairing
function bitpair. The inverse of unfold_bdd is implemented as follows:

fold_bdd :: BDD Nat -> Nat2
fold_bdd (BDD n bt) =
  (n,fuse_with bitpair bt) where
    fuse_with rf B0 = 0
    fuse_with rf B1 = 1
    fuse_with rf (D _ l r) =
      rf (fuse_with rf l,fuse_with rf r)

Note that this is a purely structural operation and that integers in first argument
position of the constructor D are actually ignored.
The two bijections work as follows:

*ISO> unfold_bdd (3,42)
BDD 3
  (D 2
    (D 1 (D 0 B0 B0)
         (D 0 B0 B0))
    (D 1 (D 0 B1 B1)
         (D 0 B1 B0)))
*ISO> fold_bdd it
(3,42)



10.3 Boolean Evaluation of BDDs 

Practical uses of BDDs involve reducing them by sharing nodes and eliminating
identical branches [26]. Note that in this case bdd2nat might give a different
result as it computes different pairing operations. Fortunately, we can try to
fold the binary tree back to a natural number by evaluating it as a boolean
function.

The function eval_bdd describes the BDD evaluator:

eval_bdd (BDD n bt) = eval_with_mask (bigone n) n bt

eval_with_mask m _ B0 = 0
eval_with_mask m _ B1 = m
eval_with_mask m n (D x l r) =
  ite_ (var_mn m n x)
       (eval_with_mask m n l)
       (eval_with_mask m n r)

var_mn mask n k = mask `div` (2^(2^(n-k-1))+1)
bigone nvars = 2^2^nvars - 1

The projection functions var_mn can be combined with the usual bitwise
integer operators to obtain new bitstring truth tables, encoding all possible
value combinations of their arguments, as shown in [27]. Note that the constant 0
evaluates to 0 while the constant 1 is evaluated as $2^{2^n} - 1$ by the function
bigone.

The function ite_ used in eval_with_mask implements the boolean function
if x then t else e using arbitrary length bitvector operations:

ite_ x t e = ((t `xor` e).&.x) `xor` e

As the following example shows, it turns out that boolean evaluation eval_bdd
faithfully emulates fold_bdd.

*ISO> unfold_bdd (3,42)
BDD 3 (D 2 (D 1 (D 0 B0 B0) (D 0 B0 B0))
           (D 1 (D 0 B1 B1) (D 0 B1 B0)))
*ISO> eval_bdd it
42

10.4 The Equivalence 

We will now state the surprising (and new!) result that boolean evaluation and 
structural transformation with repeated application of pairing produce the same 
result: 

Proposition 10 The complete binary tree of depth n, obtained by recursive
applications of bitunpair on a truth table tt, computes an (unreduced) BDD that,
when evaluated, returns the truth table, i.e.

    $fold\_bdd \circ unfold\_bdd = id$    (21)

    $eval\_bdd \circ unfold\_bdd = id$    (22)



Proof. The function unfold_bdd builds a binary tree by splitting the bitstring
$tt \in [0..2^n - 1]$ up to depth n. Observe that this corresponds to the Shannon
expansion [28] of the formula associated to the truth table, using variable order
$[n-1, \ldots, 0]$. Observe that the effect of bitunpair is the same as

- the effect of var_mn m n (n-1) acting as a mask selecting the left branch,
  and
- the effect of its complement, acting as a mask selecting the right branch.

Given that $2^n$ is the double of $2^{n-1}$, the same invariant holds at each step, as
the bitstring length of the truth table reduces to half.

We can thus assume from now on that the BDD data type defined in section
10 actually represents BDDs mapped one-to-one to truth tables given as natural
numbers. An interesting application of this result would be to investigate
practical uses of bitpair/bitunpair operations in actual circuit design.
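
The two identities of Proposition 10 can also be checked exhaustively for small n. The following is a minimal, self-contained sketch rather than the paper's code: bitunpair and bitpair are written here under the assumption that they separate and interleave the bits found at even and odd positions, and the names unfoldBT, foldBT, evalBT and prop10Holds are hypothetical stand-ins that work directly on trees instead of the BDD wrapper.

```haskell
import Data.Bits (shiftR, xor, (.&.))

-- Decision tree with variable-labelled inner nodes, as in the text.
data BT = B0 | B1 | D Integer BT BT deriving (Eq, Show)

-- Assumed definition: split n into its even-position and odd-position bits.
bitunpair :: Integer -> (Integer, Integer)
bitunpair n = (pick n, pick (n `shiftR` 1))
  where
    pick 0 = 0
    pick m = (m .&. 1) + 2 * pick (m `shiftR` 2)

-- Inverse of bitunpair: interleave the bits of x (even) and y (odd).
bitpair :: (Integer, Integer) -> Integer
bitpair (0, 0) = 0
bitpair (x, y) =
  (x .&. 1) + 2 * (y .&. 1) + 4 * bitpair (x `shiftR` 1, y `shiftR` 1)

-- Unfold a truth table tt (0 <= tt < 2^(2^n)) into a complete tree of depth n.
unfoldBT :: Integer -> Integer -> BT
unfoldBT n tt
  | n < 1 = if tt == 0 then B0 else B1
  | otherwise = D k (unfoldBT k a) (unfoldBT k b)
  where
    k = n - 1
    (a, b) = bitunpair tt

-- Structural folding: variable labels are ignored.
foldBT :: BT -> Integer
foldBT B0 = 0
foldBT B1 = 1
foldBT (D _ l r) = bitpair (foldBT l, foldBT r)

-- Boolean evaluation over bitvector truth tables of 2^n bits.
evalBT :: Integer -> BT -> Integer
evalBT n = go
  where
    mask = 2 ^ (2 ^ n :: Integer) - 1
    var k = mask `div` (2 ^ (2 ^ (n - k - 1) :: Integer) + 1)
    ite c t e = ((t `xor` e) .&. c) `xor` e
    go B0 = 0
    go B1 = mask
    go (D x l r) = ite (var x) (go l) (go r)

-- Check both identities for every n-variable truth table.
prop10Holds :: Integer -> Bool
prop10Holds n = all ok [0 .. 2 ^ (2 ^ n :: Integer) - 1]
  where
    ok tt = foldBT t == tt && evalBT n t == tt
      where
        t = unfoldBT n tt
```

Evaluating prop10Holds for n up to 3 covers all 256 three-variable truth tables; larger n grows doubly exponentially, so this is a sanity check and not a replacement for the proof.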

11 Ranking and Unranking of BDDs 

One more step is needed to extend the mapping between BDDs with n variables
to a bijective mapping from/to Nat: we will have to "shift towards infinity" the
starting point of each new block of BDDs in Nat (a block being defined by the
same number of variables) as BDDs of larger and larger sizes are enumerated.

First, we need to know by how much, so we will count the number of boolean
functions with up to n variables.

bsum 0 = 0
bsum n | n>0 = bsum1 (n-1)

bsum1 0 = 2
bsum1 n | n>0 = bsum1 (n-1) + 2^2^n

The stream of all such sums (sequence A060803 in The On-Line Encyclopedia of
Integer Sequences) can now be generated as usual:

bsums = map bsum [0..]

*ISO> genericTake 7 bsums
[0,2,6,22,278,65814,4295033110]

What we are really interested in is decomposing n into the distance n-m
to the last bsum m not exceeding n, and the index k that generates the sum.

to_bsum n = (k,n-m) where
  k=pred (head [x | x<-[0..],bsum x>n])
  m=bsum k




Unranking of an arbitrary BDD is now easy: the index k determines the number
of variables and n-m determines the rank. Together they select the right BDD
with unfold_bdd and bdd.

nat2bdd n = unfold_bdd (k,n_m) where (k,n_m)=to_bsum n

Ranking of a BDD is even easier: we shift its rank within the set of BDDs with
nv variables by the value (bsum nv) that counts the ranks previously assigned.

bdd2nat bdd@(BDD nv _) = (bsum nv)+tt where
  (_,tt) =fold_bdd bdd

As the following example shows, bdd2nat implements the inverse of nat2bdd.

*ISO> nat2bdd 42
BDD 3 (D 2 (D 1 (D 0 B0 B1) (D 0 B1 B0))
           (D 1 (D 0 B0 B0) (D 0 B0 B0)))
*ISO> bdd2nat it
42
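
The block decomposition behind this ranking can be tried in isolation. The sketch below is an illustration under the assumption that only the arithmetic part matters here; bsum is written in closed form as a sum, and toBsum and fromBsum are hypothetical names for the split of a rank into (number of variables, offset inside the block) and its inverse.

```haskell
-- Number of boolean functions over 0, 1, ..., n-1 variables: sum of 2^(2^i), i < n.
bsum :: Integer -> Integer
bsum n = sum [2 ^ (2 ^ i :: Integer) | i <- [0 .. n - 1]]

-- Split a rank into (number of variables k, offset inside the k-variable block).
-- k is the largest index whose block start bsum k does not exceed n.
toBsum :: Integer -> (Integer, Integer)
toBsum n = (k, n - bsum k)
  where
    k = until (\x -> bsum (x + 1) > n) (+ 1) 0

-- Inverse of toBsum: add the block start back to the offset.
fromBsum :: (Integer, Integer) -> Integer
fromBsum (k, offset) = bsum k + offset
```

For instance, toBsum 42 yields (3,20): rank 42 is the truth table 20 among the 3-variable functions, which is the table unfolded in the nat2bdd 42 example.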

This provides the Encoder:

pbdd :: Encoder (BDD Nat)
pbdd = compose (Iso bdd2nat nat2bdd) nat

working as follows:

*ISO> as pbdd nat 2008
BDD 4 (D 3 (D 2 B0 (D 1 (D 0 B0 B1) B1))
           (D 2 (D 1 (D 0 B1 B1) B0) (D 1 B0 B1)))
*ISO> as nat pbdd it
2008

We can now repeat the ranking function construction for eval_bdd:

ev_bdd2nat bdd@(BDD nv _) = (bsum nv)+(eval_bdd bdd)

We can confirm that ev_bdd2nat also acts as an inverse to nat2bdd:

*ISO> ev_bdd2nat (nat2bdd 2008)
2008

We obtain the Encoder:

bdd :: Encoder (BDD Nat)
bdd = compose (Iso ev_bdd2nat nat2bdd) nat

working as follows:

*ISO> as bdd nat 2008
BDD 4 (D 3 (D 2 (D 1 (D 0 B0 B0) (D 0 B0 B0))
                (D 1 (D 0 B0 B1) (D 0 B1 B0)))
           (D 2 (D 1 (D 0 B1 B1) (D 0 B0 B0))
                (D 1 (D 0 B0 B0) (D 0 B1 B0))))
*ISO> as nat bdd it
2008

This result can be seen as an intriguing isomorphism between boolean, arithmetic 
and symbolic computations. 



11.1 Reducing the BDDs 



We will sketch here a simplified reduction mechanism for BDDs eliminating
identical branches. As nodes of a BDD are mapped bijectively to unique natural
numbers, we will omit the (trivial) implementation of node sharing, with the
implicit assumption that subtrees having the same encoding are shared.

The function bdd_reduce reduces a BDD by collapsing identical left and
right subtrees, and the function bdd associates this reduced form to n ∈ Nat.

bdd_reduce (BDD n bt) = BDD n (reduce bt) where
  reduce B0 = B0
  reduce B1 = B1
  reduce (D _ l r) | l == r = reduce l
  reduce (D v l r) = D v (reduce l) (reduce r)

unfold_rbdd = bdd_reduce . unfold_bdd

The results returned by unfold_rbdd for n=2 are (with the leaves B0 and B1
printed as C 0 and C 1):

BDD 2 (C 0)
BDD 2 (D 1 (D 0 (C 1) (C 0)) (C 0))
BDD 2 (D 1 (C 0) (D 0 (C 1) (C 0)))
BDD 2 (D 0 (C 1) (C 0))
...
BDD 2 (D 1 (D 0 (C 0) (C 1)) (C 1))
BDD 2 (C 1)

We can now define the unranking operation on reduced BDDs

nat2rbdd = bdd_reduce . nat2bdd

and obtain the Encoder

rbdd :: Encoder (BDD Nat)
rbdd = compose (Iso ev_bdd2nat nat2rbdd) nat

working as follows

*ISO> as rbdd nat 2008
BDD 4 (D 3 (D 2 B0 (D 1 (D 0 B0 B1) (D 0 B1 B0)))
           (D 2 (D 1 B1 B0) (D 1 B0 (D 0 B1 B0))))
*ISO> as nat rbdd it
2008

To be able to compare its space complexity with other representations we 
will define a size operation on a BDD as follows: 

bdd_size (BDD _ t) = 1+(size t) where
  size B0 = 1
  size B1 = 1
  size (D _ l r) = 1+(size l)+(size r)

This measures the size of the BDD or reduced BDD as an expression tree. To 
take into account sharing (as present in a standard ROBDD implementation) 
one can simply eliminate duplicated subtrees: 



robdd_size (BDD _ t) = 1+(rsize t) where
  rsize = genericLength . nub . rbdd_nodes
  rbdd_nodes B0 = [B0]
  rbdd_nodes B1 = [B1]
  rbdd_nodes (D v l r) =
    [(D v l r)] ++ (rbdd_nodes l) ++ (rbdd_nodes r)
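
To make the difference between the two size measures concrete, the following self-contained sketch reduces the complete tree obtained for truth table 42 and measures it both ways. It works directly on trees (without the BDD wrapper and its extra unit of size), and the names treeSize and sharedSize are hypothetical counterparts of the size and rsize helpers above.

```haskell
import Data.List (nub)

data BT = B0 | B1 | D Integer BT BT deriving (Eq, Show)

-- Collapse every decision node whose two branches are identical.
reduce :: BT -> BT
reduce (D v l r)
  | l == r = reduce l
  | otherwise = D v (reduce l) (reduce r)
reduce leaf = leaf

-- Size as an expression tree: every node and leaf occurrence counts.
treeSize :: BT -> Int
treeSize (D _ l r) = 1 + treeSize l + treeSize r
treeSize _ = 1

-- Size with sharing: each distinct subtree counts once.
sharedSize :: BT -> Int
sharedSize = length . nub . subtrees
  where
    subtrees t@(D _ l r) = t : subtrees l ++ subtrees r
    subtrees leaf = [leaf]

-- The complete tree shown earlier for unfold_bdd (3,42).
tree42 :: BT
tree42 =
  D 2 (D 1 (D 0 B0 B0) (D 0 B0 B0))
      (D 1 (D 0 B1 B1) (D 0 B1 B0))
```

Here treeSize tree42 counts 15 nodes, reduce tree42 shrinks the tree to 7 nodes, and sharedSize (reduce tree42) counts only the 5 distinct subtrees. Using nub makes the shared count quadratic in the number of nodes, which is acceptable for the small trees considered here.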



12 Generalizing BDD ranking/unranking functions 

12.1 Encoding BDDs with Arbitrary Variable Order 

While the encoding built around the equivalence described in Prop. 10 between
bitwise pairing/unpairing operations and boolean decomposition is arguably as
simple and elegant as possible, it is useful to parametrize BDD generation with
respect to an arbitrary variable order. This is of particular importance when
using BDDs for circuit minimization, as different variable orders can make circuit
sizes flip from linear to exponential in the number of variables [26].

Given a permutation of n variables represented as natural numbers in [0..n-1]
and a truth table $tt \in [0..2^{2^n} - 1]$ we can define:

to_bdd vs tt | 0<=tt && tt <= m =
  BDD n (to_bdd_mn vs tt m n) where
    n=genericLength vs
    m=bigone n
to_bdd _ tt = error
  ("bad arg in to_bdd=>" ++ (show tt))

where the function to_bdd_mn recurses over the list of variables vs and applies
Shannon expansion [28], expressed as bitvector operations. This computes
branches f1 and f0, to be used as then and else parts, when evaluating back
the BDD to a truth table with if-then-else functions.

to_bdd_mn [] 0 _ _ = B0
to_bdd_mn [] _ _ _ = B1
to_bdd_mn (v:vs) tt m n = D v l r where
  cond=var_mn m n v
  f0= (m `xor` cond) .&. tt
  f1= cond .&. tt
  l=to_bdd_mn vs f1 m n
  r=to_bdd_mn vs f0 m n

Proposition 11 The function to_bdd builds an (unreduced) BDD correspond- 
ing to a truth table tt for variable order vs that returns tt when evaluated as a 
boolean function. 

We can reduce the resulting BDDs, and convert back from BDDs and reduced 
BDDs to truth tables with boolean evaluation: 



to_rbdd vs tt = bdd_reduce (to_bdd vs tt)
from_bdd bdd = eval_bdd bdd

We can obtain BDDs and reduced BDDs of various sizes as follows:

*ISO> as perm nat 5
[0,2,1]
*ISO> to_bdd (as perm nat 5) 42
BDD 3 (D 0 (D 2 (D 1 B0 B0) (D 1 B1 B1))
           (D 2 (D 1 B0 B0) (D 1 B1 B0)))
*ISO> to_rbdd (as perm nat 5) 42
BDD 3 (D 0 (D 2 B0 B1) (D 2 B0 (D 1 B1 B0)))
*ISO> to_rbdd (as perm nat 8) 42
BDD 3 (D 2 B0 (D 0 B1 (D 1 B1 B0)))
*ISO> from_bdd it
42
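
The claim that evaluation recovers the truth table for any variable order can be tested on its own. The sketch below is a simplified, self-contained variant of the functions above: it works on plain trees, takes the variable order as a list, and uses the hypothetical names toBT, evalBT and varMask; it is meant as an illustration of the Shannon expansion step, not as the paper's implementation.

```haskell
import Data.Bits (xor, (.&.))
import Data.List (permutations)

data BT = B0 | B1 | D Integer BT BT deriving (Eq, Show)

-- All-ones truth table for n variables (2^n bits).
bigone :: Integer -> Integer
bigone n = 2 ^ (2 ^ n :: Integer) - 1

-- Truth table of the projection onto variable k, out of n variables.
varMask :: Integer -> Integer -> Integer
varMask n k = bigone n `div` (2 ^ (2 ^ (n - k - 1) :: Integer) + 1)

-- Shannon expansion of tt following the variable order vs.
toBT :: [Integer] -> Integer -> BT
toBT vs = go vs
  where
    n = fromIntegral (length vs)
    m = bigone n
    go [] 0 = B0
    go [] _ = B1
    go (v : rest) tt = D v (go rest (c .&. tt)) (go rest ((m `xor` c) .&. tt))
      where
        c = varMask n v

-- Evaluate a tree back to a truth table with bitvector if-then-else.
evalBT :: Integer -> BT -> Integer
evalBT n = go
  where
    m = bigone n
    ite c t e = ((t `xor` e) .&. c) `xor` e
    go B0 = 0
    go B1 = m
    go (D v l r) = ite (varMask n v) (go l) (go r)

-- Round trip over every order of three variables and every truth table.
roundTrips3 :: Bool
roundTrips3 =
  and [evalBT 3 (toBT p tt) == tt | p <- permutations [0, 1, 2], tt <- [0 .. 255]]
```

With the order [0,2,1], toBT reproduces the unreduced tree printed for to_bdd (as perm nat 5) 42.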

Finally, we can obtain a minimal BDD expressing a logic function of n variables
given as a truth table as follows:

to_min_bdd n t = search_bdd min n t

search_bdd f n tt = snd $ foldl1 f
  (map (sized_rbdd tt) (all_permutations n)) where
    sized_rbdd tt vs = (robdd_size b,b) where
      b=to_rbdd vs tt

all_permutations n = if n==0 then [[]] else
  [nth2perm (n,i) | i<-[0..(factorial n)-1]] where
    factorial n=foldl1 (*) [1..n]

As the following examples show, this can provide an effective multilevel boolean
formula minimization up to functions with 6-7 arguments.

*ISO> to_min_bdd 3 42
BDD 3 (D 0 (D 2 B0 B1) (D 1 (D 2 B0 B1) B0))
*ISO> to_min_bdd 4 2008
BDD 4 (D 3 (D 1 (D 0 B0 B1) (D 0 B1 B0))
           (D 2 (D 1 (D 0 B0 B1) B0) (D 0 B1 B0)))
*ISO> to_min_bdd 7 2008
BDD 7 (D 0 (D 1 (D 2 (D 6
        (D 4 (D 3 B0 B1) (D 3 B1 B0))
        (D 5 (D 4 (D 3 B0 B1) B0)
             (D 3 B1 B0))) B0) B0) B0)
*ISO> robdd_size it
12



12.2 Multi- Terminal Binary Decision Diagrams (MTBDD) 

MTBDDs [29,30] are a natural generalization of BDDs allowing non-binary values
as leaves. Such values are typically bitstrings representing the outputs of a
multi-terminal boolean function, encoded as unsigned integers.

We shall now describe an encoding of MTBDDs that can be extended to
ranking/unranking functions, in a way similar to BDDs as shown in section 11.
Our MTBDD data type is a binary tree like the one used for BDDs, parameterized
by two integers m and n, indicating that an MTBDD represents a function
from [0..n-1] to [0..m-1], or equivalently, an n-input/m-output boolean function.

data MT a = L a | M a (MT a) (MT a) deriving (Eq,Ord,Read,Show)
data MTBDD a = MTBDD a a (MT a) deriving (Show,Eq)

The function to_mtbdd creates, from a natural number tt representing a
truth table, an MTBDD representing functions of type $N \to M$ with
$M = [0..2^m - 1]$, $N = [0..2^n - 1]$. Similarly to a BDD, it is represented as a
binary tree of n levels, except that its leaves are in $[0..2^m - 1]$.

to_mtbdd m n tt = MTBDD m n r where
  mlimit=2^m
  nlimit=2^n
  ttlimit=mlimit^nlimit
  r=if tt<ttlimit
    then (to_mtbdd_ mlimit n tt)
    else error
      ("bt: last arg "++ (show tt)++
       " should be < " ++ (show ttlimit))

Given that correctness of the range of tt has been checked, the function to_mtbdd_
applies bitmerge_unpair recursively up to depth n, where leaves in range
[0..mlimit-1] are created.

to_mtbdd_ mlimit n tt | (n<1)&&(tt<mlimit) = L tt
to_mtbdd_ mlimit n tt = (M k l r) where
  (x,y)=bitunpair tt
  k=pred n
  l=to_mtbdd_ mlimit k x
  r=to_mtbdd_ mlimit k y

Converting back from MTBDDs to natural numbers is basically the same thing
as for BDDs, except that assertions about the range of leaf data are enforced.

from_mtbdd (MTBDD m n b) = from_mtbdd_ (2^m) n b

from_mtbdd_ mlimit n (L tt) | (n<1)&&(tt<mlimit)=tt
from_mtbdd_ mlimit n (M _ l r) = tt where
  k=pred n
  x=from_mtbdd_ mlimit k l
  y=from_mtbdd_ mlimit k r
  tt=bitpair (x,y)

The following examples show that to_mtbdd and from_mtbdd are indeed inverses
on values in $[0..2^n - 1] \times [0..2^m - 1]$.

>to_mtbdd 3 3 2008
MTBDD 3 3
  (M 2
    (M 1
      (M 0 (L 2) (L 1))
      (M 0 (L 2) (L 1)))
    (M 1
      (M 0 (L 2) (L 0))
      (M 0 (L 1) (L 1))))
>from_mtbdd it
2008

>mprint (to_mtbdd 2 2) [0..3]
MTBDD 2 2
  (M 1 (M 0 (L 0) (L 0)) (M 0 (L 0) (L 0)))
MTBDD 2 2
  (M 1 (M 0 (L 1) (L 0)) (M 0 (L 0) (L 0)))
MTBDD 2 2
  (M 1 (M 0 (L 0) (L 0)) (M 0 (L 1) (L 0)))
MTBDD 2 2
  (M 1 (M 0 (L 1) (L 0)) (M 0 (L 1) (L 0)))



13 Revisiting Encodings of Finite Functions 

We will now generalize the bitpair pairing function to k-tuples and then we
will derive an alternative encoding for finite functions.

13.1 Tuple Encodings as Generalized Bitpair 

The function $to\_tuple : Nat \to Nat^k$ converts a natural number to a k-tuple
by splitting its bit representation into k groups, from which the k members in
the tuple are finally rebuilt. This operation can be seen as a transposition of a
bit matrix obtained by expanding the number in base $2^k$:

to_tuple k n = map (from_base 2) (
    transpose (
      map (to_maxbits k) (
        to_base (2^k) n
      )
    )
  )

To convert a k-tuple back to a natural number we will merge their bits, k at a
time. This operation uses the transposition of a bit matrix obtained from the
tuple, seen as a number in base $2^k$, with help from bit crunching functions given
in the Appendix:

from_tuple ns = from_base (2^k) (
    map (from_base 2) (
      transpose (
        map (to_maxbits l) ns
      )
    )
  ) where
      k=genericLength ns
      l=max_bitcount ns

The following example shows the decoding of 42, its decomposition in bits (right
to left), the formation of a 3-tuple and the encoding of the tuple back to 42.

*ISO> to_base 2 42
[0,1,0,1,0,1]
*ISO> to_tuple 3 42
[2,1,2]
*ISO> to_base 2 2
[0,1]
*ISO> to_base 2 1
[1]
*ISO> from_tuple [2,1,2]
42
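
Since the bit crunching helpers live in the Appendix, the following self-contained sketch restates the transposition idea with local helpers. The names toBase, fromBase and padTo are assumptions standing in for to_base, from_base and to_maxbits; digits are kept least significant first, as in the to_base 2 42 example.

```haskell
import Data.List (transpose)

-- Little-endian digits of n in base b (toBase 2 42 == [0,1,0,1,0,1]).
toBase :: Integer -> Integer -> [Integer]
toBase b n
  | q == 0 = [d]
  | otherwise = d : toBase b q
  where
    (q, d) = n `quotRem` b

-- Inverse of toBase.
fromBase :: Integer -> [Integer] -> Integer
fromBase b = foldr (\d acc -> d + b * acc) 0

-- Pad a little-endian bit list with zeros up to width w.
padTo :: Int -> [Integer] -> [Integer]
padTo w bs = take w (bs ++ repeat 0)

-- Split n into k numbers (k >= 1) by dealing out its bits round-robin.
toTuple :: Int -> Integer -> [Integer]
toTuple k n =
  map (fromBase 2) (transpose (map (padTo k . toBase 2) (toBase (2 ^ k) n)))

-- Merge the bits of the tuple members back, k bits at a time.
fromTuple :: [Integer] -> Integer
fromTuple [] = 0
fromTuple ns =
  fromBase (2 ^ k) (map (fromBase 2) (transpose (map (padTo w) bits)))
  where
    k = length ns
    bits = map (toBase 2) ns
    w = maximum (map length bits)
```

For k = 2 this dealing of bits coincides with splitting a number into its even-position and odd-position bits, which is why the construction can be read as a generalized bitunpair.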

Fig. 18 shows multiple steps of the same decomposition, with shared nodes
collected in a DAG.




Fig. 18: Repeated 3-tuple expansions: 42 and 2008



13.2 Encoding Finite Functions as Tuples 

As finite sets can be put in a bijection with an initial segment of Nat, a finite
function can be seen as a function defined from an initial segment of Nat to
Nat. We can encode and decode a finite function from [0..k-1] to Nat (seen as
the list of its values), as a natural number:

ftuple2nat [] = 0
ftuple2nat ns = succ (pepis_pair (pred k,t)) where
  k=genericLength ns
  t=from_tuple ns

nat2ftuple 0 = []
nat2ftuple kf = to_tuple (succ k) f where
  (k,f)=pepis_unpair (pred kf)

As the length of the tuple, k, is usually smaller than the number obtained by
merging the bits of the k-tuple, we have picked the Pepis pairing function,
exponential in its first argument and linear in its second, to embed the length of the
tuple needed for the decoding. This suggests the following alternative Encoder
for finite functions:

fun' :: Encoder [Nat]
fun' = compose (Iso ftuple2nat nat2ftuple) nat

as well as the related alternative hylomorphism:

nat_fun' = Iso nat2ftuple ftuple2nat

hff' :: Encoder T
hff' = compose (hylo nat_fun') nat

The encoding/decoding and the hylomorphism work as follows:

*ISO> as fun' nat 2008
[3,2,3,1]
*ISO> as nat fun' it
2008
*ISO> as hff' nat 2008
H [H [H [H []]],H [H [],H []],H [H [H []]],H [H []]]
*ISO> as nat hff' it
2008
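
The role of the Pepis pairing in this encoding can be seen by running it on the example. The sketch below assumes the usual definition $2^x(2y+1) - 1$, which matches the description "exponential in its first argument and linear in its second"; pepisPair and pepisUnpair are hypothetical names for pepis_pair and pepis_unpair.

```haskell
-- Assumed definition of the Pepis pairing: 2^x * (2y + 1) - 1.
pepisPair :: (Integer, Integer) -> Integer
pepisPair (x, y) = 2 ^ x * (2 * y + 1) - 1

-- Inverse: x counts the factors of 2 in z + 1, y comes from the odd remainder.
pepisUnpair :: Integer -> (Integer, Integer)
pepisUnpair z = go 0 (z + 1)
  where
    go x m
      | even m = go (x + 1) (m `div` 2)
      | otherwise = (x, (m - 1) `div` 2)
```

With this definition, pepisUnpair 2007 gives (3,125): decoding 2008 therefore asks for a tuple of length 4 built from the number 125, which is consistent with the result [3,2,3,1] shown above.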



14 Directed Graphs, Undirected graphs, Multigraphs 
and Hypergraphs 

We will now show that more complex data types like digraphs and hypergraphs
have extremely simple encoders. This shows once more the importance of
compositionality in the design of our embedded transformation language.

14.1 Encoding Directed Graphs 

We can find a bijection from directed graphs (with no isolated vertices,
corresponding to their view as binary relations) to finite sets by fusing their list of
ordered pair representation into finite sets with a pairing function:

digraph2set ps = map bitpair ps
set2digraph ns = map bitunpair ns

The resulting Encoder is: 

digraph :: Encoder [Nat2]
digraph = compose (Iso digraph2set set2digraph) set

working as follows:

*ISO> as digraph nat 2008
[(1,1),(2,0),(2,1),(3,1),(0,2),(1,2),(0,3)]
*ISO> as nat digraph it
2008
*ISO> as rbdd digraph [(1,1),(2,0),(2,1),
                       (3,1),(0,2),(1,2),(0,3)]
BDD 4 (D 3 (D 2 B0 (D 1 (D 0 B0 B1) (D 0 B1 B0)))
           (D 2 (D 1 B1 B0) (D 1 B0 (D 0 B1 B0))))
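
The digraph encoding can be replayed without the Encoder machinery. This sketch assumes that the set view of a natural number is the list of positions of its 1 bits and that bitunpair separates even-position and odd-position bits; nat2digraph and digraph2nat are hypothetical names for the two directions of the composed transformation.

```haskell
import Data.Bits (shiftR, (.&.))

-- Positions of the 1 bits of n, in increasing order (assumed set view).
nat2set :: Integer -> [Integer]
nat2set = go 0
  where
    go _ 0 = []
    go i m
      | odd m = i : go (i + 1) (m `shiftR` 1)
      | otherwise = go (i + 1) (m `shiftR` 1)

-- Inverse of nat2set on lists of distinct exponents.
set2nat :: [Integer] -> Integer
set2nat = sum . map (2 ^)

-- Assumed definition: split n into its even-position and odd-position bits.
bitunpair :: Integer -> (Integer, Integer)
bitunpair n = (pick n, pick (n `shiftR` 1))
  where
    pick 0 = 0
    pick m = (m .&. 1) + 2 * pick (m `shiftR` 2)

-- Inverse of bitunpair.
bitpair :: (Integer, Integer) -> Integer
bitpair (0, 0) = 0
bitpair (x, y) =
  (x .&. 1) + 2 * (y .&. 1) + 4 * bitpair (x `shiftR` 1, y `shiftR` 1)

-- Each set element becomes one edge (source, target).
nat2digraph :: Integer -> [(Integer, Integer)]
nat2digraph = map bitunpair . nat2set

digraph2nat :: [(Integer, Integer)] -> Integer
digraph2nat = set2nat . map bitpair
```

Because every edge is itself a single natural number, distinct edge sets map to distinct naturals, which is what makes the encoding bijective on edge sets.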

Fig. 19 shows the digraph associated to 2008.




14.2 Encoding Undirected Graphs 

We can find a bijection from undirected graphs to finite sets by fusing their 
list of unordered pair representation into finite sets with a pairing function on 
unordered pairs: 

graph2set ps = map unord_pair ps 
set2graph ns = map unord_unpair ns 

The resulting Encoder is: 





Fig. 19: 2008 as a digraph 



graph :: Encoder [[Nat]]
graph = compose (Iso graph2set set2graph) set

working as follows:

*ISO> as graph nat 2008
[[1,3],[2,3],[2,4],[3,5],[0,3],[1,4],[0,4]]
*ISO> as nat graph it
2008
*ISO> as nat graph [[1,3],[3,2],[2,4],[5,3],[0,3],[4,1],[0,4]]
2008

Note that, as expected, the result is invariant to changing the order of elements 
in pairs like [1,4] and [3,5] to [4,1] and [5,3]. 

14.3 Encoding Directed Multigraphs 

We can find a bijection from directed multigraphs (directed graphs with multiple
edges between pairs of vertices) to finite sequences by fusing their list of ordered
pair representation into finite sequences with a pairing function.
The resulting Encoder is:

mdigraph :: Encoder [Nat2]
mdigraph = compose (Iso digraph2set set2digraph) fun

working as follows:

*ISO> as mdigraph nat 2008
[(1,1),(0,0),(1,0),(0,0),(0,0),(0,0),(0,0)]
*ISO> as nat mdigraph it
2008

Note that the only change to the digraph Encoder is replacing the composition 
with set by a composition with fun. 

14.4 Encoding Undirected Multigraphs 

We can find a bijection from undirected multigraphs (undirected graphs with
multiple edges between unordered pairs of vertices) to finite sequences by fusing
their list of pair representation into finite sequences with a pairing function on
unordered pairs.
The resulting Encoder is:

mgraph :: Encoder [[Nat]]
mgraph = compose (Iso graph2set set2graph) fun

working as follows:

*ISO> as mgraph nat 2008
[[1,3],[0,1],[1,2],[0,1],[0,1],[0,1],[0,1]]
*ISO> as nat mgraph it
2008



Note that the only change to the graph Encoder is replacing the composition 
with set by a composition with fun. 

14.5 Encoding Hypergraphs 

Definition 2 A hypergraph (also called set system) is a pair H = (X, E) where
X is a set and E is a set of non-empty subsets of X.

We can easily derive a bijective encoding of hypergraphs, represented as sets of
sets:

set2hypergraph = map nat2set
hypergraph2set = map set2nat

The resulting Encoder is:

hypergraph :: Encoder [[Nat]]
hypergraph = compose (Iso hypergraph2set set2hypergraph) set

working as follows

*ISO> as hypergraph nat 2008
[[0,1],[2],[1,2],[0,1,2],[3],[0,3],[1,3]]
*ISO> as nat hypergraph it
2008

15 Encoding SAT problems 

Boolean Satisfiability (SAT) problems are encoded as lists of lists representing 
conjunctions of disjunctions of positive or negative propositional symbols. 
After defining: 

set2sat = map (set2disj . nat2set) where
  shift0 z = if (z<0) then z else z+1
  set2disj = map (shift0 . nat2z)

sat2set = map (set2nat . disj2set) where
  shiftback0 z = if (z<0) then z else z-1
  disj2set = map (z2nat . shiftback0)

we obtain the Encoder

sat :: Encoder [[Z]]
sat = compose (Iso sat2set set2sat) set

working as follows:

*ISO> as sat nat 2008
[[1,-1],[2],[-1,2],[1,-1,2],[-2],[1,-2],[-1,-2]]
*ISO> as nat sat it
2008
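
The SAT encoding relies on a bijection between Nat and Z that is defined elsewhere in the paper. The sketch below assumes the common zigzag mapping (0, 1, 2, 3, ... to 0, -1, 1, -2, ...), which reproduces the transcript above; nat2sat and sat2nat are hypothetical names for the two directions, and nat2set is again assumed to list the positions of the 1 bits.

```haskell
-- Positions of the 1 bits of n, in increasing order (assumed set view).
nat2set :: Integer -> [Integer]
nat2set = go 0
  where
    go _ 0 = []
    go i m
      | odd m = i : go (i + 1) (m `div` 2)
      | otherwise = go (i + 1) (m `div` 2)

set2nat :: [Integer] -> Integer
set2nat = sum . map (2 ^)

-- Assumed zigzag bijection between naturals and integers.
nat2z :: Integer -> Integer
nat2z n = if even n then n `div` 2 else negate ((n + 1) `div` 2)

z2nat :: Integer -> Integer
z2nat z = if z < 0 then (-2) * z - 1 else 2 * z

-- Literals are non-zero integers, so non-negative values are shifted by one.
nat2sat :: Integer -> [[Integer]]
nat2sat = map (map (shift0 . nat2z) . nat2set) . nat2set
  where
    shift0 z = if z < 0 then z else z + 1

sat2nat :: [[Integer]] -> Integer
sat2nat = set2nat . map (set2nat . map (z2nat . shiftback0))
  where
    shiftback0 z = if z < 0 then z else z - 1
```

The shift by one is what keeps 0 out of the clauses, so that every literal has a sign.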

Clearly this encoding can be used to generate random SAT problems out of 
easier to generate random natural numbers. 



16 An Encoder for Graph Models 



Graph models [31,32] provide a semantics of λ-calculus (Y-combinator included)
in terms of sets of finite sets of natural numbers. Following [31], a graph model
is a pair (D, p) where D is an infinite set and $p : D^* \times D \to D$ is an injective
total function. We will strengthen this to be a bijection, for the case D = Nat,
as follows.

gmodel2nat (set,m) = pred (fun2nat (m : (set2fun set)))
nat2gmodel n = (fun2set xs,m) where (m:xs) = nat2fun (succ n)

This provides the Encoder:

type Gdomain = ([Nat],Nat)
gmodel :: Encoder Gdomain
gmodel = compose (Iso gmodel2nat nat2gmodel) nat

working as follows:

*ISO> as gmodel nat 42
([0,2,4],0)
*ISO> as nat gmodel it
42

The interest of such models is that they provide an accurate set theoretic
semantics for untyped lambda calculus describing key computational mechanisms
like β-conversion and fixpoint combinators.



17 A mapping to a dense set: Dyadic Rationals in [0, 1) 

So far our isomorphisms have focused on natural numbers, finite sets and other
discrete data types. Dyadic rationals are fractions with denominators restricted
to be powers of 2. They are a dense set in $\mathbb{R}$, i.e. they provide arbitrarily close
approximations for any real number. An interesting isomorphism to such a set
would allow borrowing things like distance or average functions that could have
interesting interpretations in symbolic or boolean domains. It also makes sense
to pick a bounded subdomain of the dyadic rationals that can be meaningful
as the range of probabilistic boolean functions or fuzzy sets. We will build an
Encoder for Dyadic Rationals in [0, 1) by providing a bijection from finite sets
of natural numbers seen this time as negative exponents of 2.

dyadic :: Encoder (Ratio Nat)
dyadic = compose (Iso dyadic2set set2dyadic) set

The function set2dyadic mimics set2nat defined in subsection 3.3 except for
the use of negative exponents and computation on rationals.

set2dyadic :: [Nat] -> Ratio Nat
set2dyadic ns = rsum (map nexp2 ns) where
  nexp2 0 = 1%2
  nexp2 n = (nexp2 (n-1))*(1%2)

  rsum [] = 0%1
  rsum (x:xs) = x+(rsum xs)

The function dyadic2set extracts negative exponents of two from a dyadic
rational and it is modeled after nat2set defined in subsection 3.3.

dyadic2set :: Ratio Nat -> [Nat]
dyadic2set n | good_dyadic n = dyadic2exps n 0 where
  dyadic2exps 0 _ = []
  dyadic2exps n x =
    if (d<1) then xs else (x:xs) where
      d = 2*n
      m = if d<1 then d else (pred d)
      xs=dyadic2exps m (succ x)
dyadic2set _ =
  error "dyadic2set: argument not a dyadic rational"

As not all rational numbers are dyadics in [0, 1), the predicate good_dyadic is
needed to validate the input of dyadic2set. This also ensures that dyadic2set
always terminates returning a finite set.

good_dyadic kn = (k==0 && n==1)
  || ((kn>0%1) && (kn<1%1) && (is_exp2 n)) where
    k=numerator kn
    n=denominator kn

    is_exp2 1 = True
    is_exp2 n | even n = is_exp2 (n `div` 2)
    is_exp2 n = False

Some examples of borrow/lend operations are:

dyadic_dist x y = abs (x-y)

dist_for t x y = as dyadic t
  (borrow2 (with dyadic t) dyadic_dist x y)

dsucc = borrow (with nat dyadic) succ
dplus = borrow2 (with nat dyadic) (+)

dconcat = lend2 dyadic (++)

*ISO> dist_for nat 6 7
1%2
*ISO> dist_for set [1,2,3] [3,4,5]
21%64
*ISO> dsucc (3%8)
7%8
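
The set/dyadic correspondence itself is short enough to replay with Data.Ratio alone. The following sketch uses Rational instead of Ratio Nat and leaves out the good_dyadic validation, so its dyadic2set carries an unchecked precondition; it is an illustration of the binary-expansion idea, not a drop-in replacement for the functions above.

```haskell
import Data.Ratio ((%))

-- Sum of 2^-(e+1) over the exponents e in the set; the result lies in [0,1).
set2dyadic :: [Integer] -> Rational
set2dyadic = sum . map (\e -> 1 % (2 ^ (e + 1)))

-- Binary expansion of a dyadic rational q: positions of the 1 digits.
-- Precondition (not checked here): q is dyadic and 0 <= q < 1.
dyadic2set :: Rational -> [Integer]
dyadic2set = go 0
  where
    go :: Integer -> Rational -> [Integer]
    go _ 0 = []
    go e q
      | d >= 1 = e : go (e + 1) (d - 1)
      | otherwise = go (e + 1) d
      where
        d = 2 * q

-- Distance borrowed from the rationals, applied to two finite sets.
setDist :: [Integer] -> [Integer] -> Rational
setDist xs ys = abs (set2dyadic xs - set2dyadic ys)
```

For example, setDist [1,2,3] [3,4,5] is 21 % 64, the value reported by dist_for set in the transcript.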



Fig. 20 shows the dyadic rationals associated to natural numbers in [0..255].







Fig. 20: Dyadic rationals associated to n in [0..255] 



18 Strings and Parenthesis Languages 



18.1 Encoding Strings 

As strings can be seen just as a notational equivalent of lists of natural numbers,
we obtain an Encoder immediately as:

string :: Encoder String
string = Iso string2fun fun2string

string2fun cs = map (fromIntegral . ord) cs

fun2string ns = map (chr . fromIntegral) ns

Note however that this is only an isomorphism within the chr/ord conversion
range, therefore we shall assume this constraint as a law governing this Encoder.

*ISO> as set string "hello"
[104,206,315,424,536]
*ISO> as string set it
"hello"



18.2 Encoding a Parenthesis Language 

An encoder for a parenthesis language is obtained by combining a parser and
a writer. As Hereditarily Finite Functions naturally map one-to-one to
parenthesis expressions, we will choose them as target of the transformers.

pars :: Encoder [Char]
pars = compose (Iso pars2hff hff2pars) hff

The parser recurses over a string and builds a HFF as follows:

pars2hff cs = parse_pars '(' ')' cs

parse_pars l r cs | newcs == [] = t where
  (t,newcs)=pars_expr l r cs

pars_expr l r (c:cs) | c==l = ((H ts),newcs) where
  (ts,newcs) = pars_list l r cs

pars_list l r (c:cs) | c==r = ([],cs)
pars_list l r (c:cs) = ((t:ts),cs2) where
  (t,cs1)=pars_expr l r (c:cs)
  (ts,cs2)=pars_list l r cs1

The writer recurses over a HFF and collects matching parenthesis pairs:

hff2pars = collect_pars '(' ')'

collect_pars l r (H ns) =
  [l]++
    (concatMap (collect_pars l r) ns)
  ++[r]

The transformations of 42 look as follows:

*ISO> as pars nat 42
"((())(())(()))"
*ISO> as hff pars it
H [H [H []],H [H []],H [H []]]
*ISO> as nat hff it
42
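
A variant of the parser that reports malformed input instead of failing on a pattern match can be sketched as follows. This is a design alternative written for illustration: the Maybe-returning fromPars and the writer toPars are hypothetical names, and the tree type T mirrors the H constructor used in the transcripts.

```haskell
-- Hereditarily finite structure: a node is just a list of subtrees.
data T = H [T] deriving (Eq, Show)

-- Writer: each node contributes one matching pair of parentheses.
toPars :: T -> String
toPars (H ts) = "(" ++ concatMap toPars ts ++ ")"

-- Parser: Nothing on unbalanced input or trailing characters.
fromPars :: String -> Maybe T
fromPars s = case expr s of
  Just (t, "") -> Just t
  _ -> Nothing
  where
    -- An expression is '(' followed by a list of expressions and ')'.
    expr ('(' : cs) = do
      (ts, rest) <- list cs
      Just (H ts, rest)
    expr _ = Nothing
    -- A list ends at the first unmatched ')'.
    list (')' : cs) = Just ([], cs)
    list cs = do
      (t, cs1) <- expr cs
      (ts, cs2) <- list cs1
      Just (t : ts, cs2)
```

On well-formed strings toPars and fromPars are mutually inverse, so the pair could serve as the two sides of an Iso once the Maybe is unwrapped; the partial version in the text trades this safety for brevity.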

Alternatively, by using 0 and 1 as left and right parentheses we can define:

bitpars2hff cs = parse_pars 0 1 cs
hff2bitpars = collect_pars 0 1

hff_pars :: Encoder [Nat]
hff_pars = compose (Iso bitpars2hff hff2bitpars) hff

working as follows:

*ISO> as hff_pars nat 2008
[0,0,0,1,0,1,1,0,1,0,0,1,1,0,1,0,1,0,1,0,1,1]
*ISO> as nat hff_pars it
2008
*ISO> as nat bits (as hff_pars nat 2008)
7690599

As the last example shows, the information density of a parenthesis representation
is lower. This is expected, given that order is constrained by balancing and
content is constrained by having the same number of 0s and 1s. The following
example

*ISO> map ((as nat bits) . (as hff_pars nat)) [0..7]
[5,27,119,115,495,483,471,467]

shows that this application is injective only. Therefore a succinct representation
of an abstract tree structure can be obtained by encoding it as a natural number
as in:

*ISO> as nat pars "((()())()(())()()()())"
2008

Note however, that

*ISO> as nat bits (as hff_pars nat (2^2^16))
32639

while the conventional representation of the same number would need 65537 bits
(close to twenty thousand decimal digits). This suggests defining:

nat2parnat n = as nat bits (as hff_pars nat n)

parnat2nat n = as nat hff_pars (as bits nat n)

and finding out that

*ISO> [x|x<-[0..2^16],nat2parnat x<x]
[8192,16384,32768,32769,49152,65536]

One can see that more compact representations only happen for a few numbers
that are powers of two or "sparse" sums of powers of two. A good way to evaluate
"information density" for an arbitrary data type that is isomorphic to Nat
through one of our encoders is to compute the total bitsize of its actual encoding
over an interval like $[0..2^n - 1]$. For instance,

hff_bitsize n= sum (map hff_bsize [0..2^n-1])
hff_bsize k=genericLength (as bits nat (nat2parnat k))

Knowing that the optimal bit representation of all numbers in $[0..2^n - 1]$ totals
$n \cdot 2^n$ ($2^n$ of them, n bits each), we can define a measure of information density
for a bit-encoded parenthesis language seen as a representation for HFF as:

info_density_hff n = (n*2^n)%(hff_bitsize n)

One can see that information density progressively increases to converge to a
value above half of the "perfect" value of 1:

*ISO> map info_density_hff [0..12]
[0%1,1%3,4%9,12%25,1%2,80%157,16%31,112%215,
 32%61,48%91,1024%1933,2816%5297,2048%3841]
*ISO> map fromRational it
[0.0,0.3333333333333333,0.4444444444444444,0.48,0.5,
 0.5095541401273885,0.5161290322580645,0.5209302325581395,
 0.5245901639344263,0.5274725274725275,0.5297465080186239,
 0.5316216726448934,0.5331944806040094]

To compare this with the information density of hereditarily finite sets,
multisets and permutations, we can also map their structure to a bit-represented
parenthesis language by defining the encoders:

pars_hf=Iso bitpars2hff hff2bitpars

hff_pars' :: Encoder [Nat]
hff_pars' = compose pars_hf hff'

hfs_pars :: Encoder [Nat]
hfs_pars = compose pars_hf hfs

hfpm_pars :: Encoder [Nat]
hfpm_pars = compose pars_hf hfpm

hfm_pars :: Encoder [Nat]
hfm_pars = compose pars_hf hfm

bhfm_pars :: Encoder [Nat]
bhfm_pars = compose pars_hf hfbm

bhfm_pars' :: Encoder [Nat]
bhfm_pars' = compose pars_hf hfbm'

hfp_pars :: Encoder [Nat]
hfp_pars = compose pars_hf hfp

and then defining:

parsize_as t n = genericLength (hff2bitpars (as t nat n))
parsizes_to m t = map (parsize_as t) [0..2^m-1]
nat2hfsnat n = as nat bits (as hfs_pars nat n)
hfs_bitsize n= sum (map hfs_bsize [0..2^n-1])
hfs_bsize k=genericLength (as bits nat (nat2hfsnat k))
info_density_hfs n = (n*2^n)%(hfs_bitsize n)

The intuition that hereditarily finite functions have higher information density
than hereditarily finite sets can now be conjectured:

*ISO> map info_density_hfs [0..12]
[0%1,1%3,2%5,3%8,1%3,5%16,2%7,7%27,4%17,3%13,2%9,11%52,1%5]
*ISO> map fromRational it
[0.0,0.3333333333333333,0.4,0.375,0.3333333333333333,
 0.3125,0.2857142857142857,0.25925925925925924,0.23529411764705882,
 0.23076923076923078,0.2222222222222222,0.21153846153846154,0.2]

Contrary to the case of bit-encoded HFFs, in this case information density is
decreasing for larger values, an observation that can help with finding a simple
proof for the conjecture. More generally, such techniques suggest applications to
experimental mathematics.

19 Self-delimiting codes 

A more precise estimate of the actual size of various bitstring representations
also requires counting the overhead for "delimiting" their components. An
asymptotically optimal mechanism for this is the use of a universal self-delimiting code,
for instance the Elias omega code. To implement it, the encoder proceeds by
recursively encoding the length of the string, the length of the length of the string,
etc.

to_elias :: Nat -> [Nat]
to_elias n = (to_eliasx (succ n))++[0]

to_eliasx 1 = []
to_eliasx n = xs where
  bs=to_lbits n
  l=(genericLength bs)-1
  xs = if l<2 then bs else (to_eliasx l)++bs

The decoder first rebuilds recursively the sequence of lengths and then the actual 
bitstring. It makes sense to design the decoder to extract the number represented 
by the self-delimiting code from a sequence/stream of bits and also return what 
is left after the extraction. 

from_elias :: [Nat] -> (Nat, [Nat])
from_elias bs = (pred n,cs) where (n,cs)=from_eliasx 1 bs

from_eliasx n (0:bs) = (n,bs)
from_eliasx n (1:bs) = r where
  hs=genericTake n bs
  ts=genericDrop n bs
  n'=from_lbits (1:hs)
  r=from_eliasx n' ts

to_lbits = reverse . (to_base 2)

from_lbits = (from_base 2) . reverse

We obtain the Encoder:



elias :: Encoder [Nat]
elias = compose (Iso (fst . from_elias) to_elias) nat

working as follows:

*ISO> as elias nat 42
[1,0,1,0,1,1,0,1,0,1,1,0]
*ISO> as nat elias it 
42 

*ISO> as elias nat 2008 
[1,1,1,0,1,0,1,1,1,1,1,0,1,1,0,0,1,0] 
*ISO> as nat elias it 
2008 

Note that self-delimiting codes are not onto the regular language {0, 1}*, there- 
fore this Encoder cannot be used to map arbitrary bitstrings to numbers. 
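
The coder above can be tried in isolation. The following is a minimal sketch, assuming bits are plain Integer values and replacing the paper's to_lbits and from_lbits with two local helpers (toBits and fromBits are our names, not part of the ISO module):

```haskell
-- Self-contained sketch of the Elias omega coder described above.
-- toBits/fromBits are hypothetical stand-ins for to_lbits/from_lbits.

-- Most-significant-bit-first binary digits of a positive number.
toBits :: Integer -> [Integer]
toBits n = reverse (go n)
  where
    go 0 = []
    go k = k `mod` 2 : go (k `div` 2)

-- Inverse of toBits.
fromBits :: [Integer] -> Integer
fromBits = foldl (\acc b -> 2 * acc + b) 0

-- Encode n >= 0: write n+1 preceded by its nested length prefixes,
-- then close the code with a single 0.
toElias :: Integer -> [Integer]
toElias n = groups (n + 1) ++ [0]
  where
    groups 1 = []
    groups k = groups (fromIntegral (length bs) - 1) ++ bs
      where bs = toBits k

-- Decode one number from the front of a bit stream and return it
-- together with the unread remainder of the stream.
fromElias :: [Integer] -> (Integer, [Integer])
fromElias = go 1
  where
    go n (0 : rest) = (n - 1, rest)
    go n (1 : rest) = go (fromBits (1 : hs)) ts
      where (hs, ts) = splitAt (fromIntegral n) rest
    go _ _ = error "fromElias: malformed or truncated code"
```

Because the decoder returns the unread suffix, several codes can be concatenated and read back one at a time, which is what the sequence encodings in the later subsection on sparseness rely on.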

20 Encoding DNA 

We have covered so far encodings for "artificial entities" used in various fields. 
We will now add an encoding of "natural origin" , DNA bases and strands. While 
it is an (utterly) simplified model of the real thing, it captures some essential 
algebraic properties of DNA bases and strands. 

We start with a DNA data type, following [33,34]:

data Base = Adenine | Cytosine | Guanine | Thymine
  deriving (Eq, Ord, Show, Read)

type DNA = [Base]

We will encode/decode the DNA base alphabet as follows: 

alphabet2code Adenine = 0
alphabet2code Cytosine = 1
alphabet2code Guanine = 2
alphabet2code Thymine = 3

code2alphabet 0 = Adenine
code2alphabet 1 = Cytosine
code2alphabet 2 = Guanine
code2alphabet 3 = Thymine

The mapping is simply a symbolic variant of conversion to/from base 4:

dna2nat = (from_base 4) . (map alphabet2code)

nat2dna = (map code2alphabet) . (to_base 4)

We can now define an Encoder for base sequences as follows:

dna :: Encoder DNA
dna = compose (Iso dna2nat nat2dna) nat



A first set of DNA operations acts on base sequences. The transformation
between complements looks as follows:

dna_complement :: DNA -> DNA
dna_complement = map to_compl where
  to_compl Adenine = Thymine
  to_compl Cytosine = Guanine
  to_compl Guanine = Cytosine
  to_compl Thymine = Adenine

Reversing is just list reversal.

dna_reverse :: DNA -> DNA
dna_reverse = reverse

As reversal and complement are independent operations their composition is
commutative - we can pick reversing first and then complementing:

dna_comprev :: DNA -> DNA
dna_comprev = dna_complement . dna_reverse

The following examples show interaction of DNA codes with other data types
and their operations:

*ISO> as dna nat 2008
[Adenine,Guanine,Cytosine,Thymine,Thymine,Cytosine]
*ISO> borrow (with dna nat) dna_reverse 42
42
*ISO> borrow (with dna nat) dna_reverse 2008
637
*ISO> borrow (with dna nat) dna_complement 2008
2087
*ISO> borrow (with dna nat) dna_comprev 2008
3458
*ISO> borrow (with dna bits) dna_comprev [1,0,1,0,1,1,0,1,0,1]
[1,1,1,0,1,0,0,0,0,1,1]

Note that each of these DNA operations induces a bijection Nat → Nat.
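
The numeric results above can be reproduced with a small stand-alone sketch. It assumes that the digits of to_base 4 are listed least significant first and that 0 is written as [0]; the names toBase4, fromBase4, natToDna, dnaToNat and onNat are ours and stand in for the paper's to_base, from_base, nat2dna, dna2nat and borrow:

```haskell
data Base = Adenine | Cytosine | Guanine | Thymine
  deriving (Eq, Ord, Show, Read, Enum)

type DNA = [Base]

-- Base-4 digits, least significant first; 0 is written as [0].
-- (Assumed to mirror the behaviour of the paper's to_base 4.)
toBase4 :: Integer -> [Integer]
toBase4 n
  | q == 0    = [d]
  | otherwise = d : toBase4 q
  where (q, d) = n `quotRem` 4

fromBase4 :: [Integer] -> Integer
fromBase4 = foldr (\d acc -> d + 4 * acc) 0

natToDna :: Integer -> DNA
natToDna = map (toEnum . fromIntegral) . toBase4

dnaToNat :: DNA -> Integer
dnaToNat = fromBase4 . map (fromIntegral . fromEnum)

-- Adenine <-> Thymine and Cytosine <-> Guanine, i.e. digit d becomes 3 - d.
complement :: DNA -> DNA
complement = map (\b -> toEnum (3 - fromEnum b))

-- Apply a DNA operation to a number by converting there and back.
onNat :: (DNA -> DNA) -> Integer -> Integer
onNat f = dnaToNat . f . natToDna
```

With these definitions, onNat reverse 2008 evaluates to 637 and onNat complement 2008 to 2087, matching the session above. Under this positional reading a strand that ends in Adenine loses that final base on a round trip through dnaToNat, so the sketch only illustrates the arithmetic and is not a substitute for the Encoder itself.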

Like signed integers, DNA strands have "polarity" - their direction matters: 

data Polarity = P3x5 | P5x3 
deriving (Eq , Ord , Show , Read) 



data DNAstrand = DNAstrand Polarity DNA 
deriving (Eq , Ord , Show , Read) 

Polarity can be easily encoded as parity even/odd: 

strand2nat (DNAstrand polarity strand) =
  add_polarity polarity (dna2nat strand) where
    add_polarity P3x5 x = 2*x
    add_polarity P5x3 x = 2*x-1

nat2strand n =
  if even n
    then DNAstrand P3x5 (nat2dna (n `div` 2))
    else DNAstrand P5x3 (nat2dna ((n+1) `div` 2))

We can now define an Encoder for DNA strands: 

dnaStrand :: Encoder DNAstrand
dnaStrand = compose (Iso strand2nat nat2strand) nat

Two additional operations lift DNA sequences to strands with polarities: 

dna_down :: DNA -> DNAstrand
dna_down = (DNAstrand P3x5) . dna_complement

dna_up :: DNA -> DNAstrand
dna_up = DNAstrand P5x3

We can now lend or borrow operations as follows:

*ISO> as dnaStrand nat 1234
DNAstrand P3x5 [Cytosine,Guanine,Guanine,Cytosine,Guanine]
*ISO> lend (with dnaStrand nat) succ
        (DNAstrand P5x3 [Adenine,Cytosine,Guanine,Thymine])
DNAstrand P5x3 [Cytosine,Cytosine,Guanine,Thymine]

The DoubleHelix is a stable combination of two complementary strands. This 
built-in redundancy protects against unwanted mutations. 

data DoubleHelix = DoubleHelix DNAstrand DNAstrand
  deriving (Eq, Ord, Show, Read)

dna_double_helix :: DNA -> DoubleHelix
dna_double_helix s =
  DoubleHelix (dna_up s) (dna_down s)

We can now generate a double helix from a natural number:

*ISO> dna_double_helix (nat2dna 33)
DoubleHelix
  (DNAstrand P5x3 [Cytosine,Adenine,Guanine])
  (DNAstrand P3x5 [Guanine,Thymine,Cytosine])

This can be used for generating random instances of double helices by reusing
a random generator for natural numbers.

21 Testing It All 

We will now describe a random testing mechanism to validate our Encoders. 

While QuickCheck [2, provides an elegant general purpose random tester, it 
would require writing a specific adaptor for each isomorphism. We will describe 
here a shortcut through a few higher order combinators. 

First, we build a simple random generator for nat:

rannat = rand (2^50)

rand :: Nat->Nat->Nat
rand max seed = n where
  (n,g)=randomR (0,max) (mkStdGen (fromIntegral seed))

We can now design a generic random test for any Encoder as follows: 

rantest :: Encoder t->Bool
rantest t = and (map (rantest1 t) [0..255])

rantest1 t n = x==(visit_as t x) where x=rannat n

visit_as t = (to nat) . (from t) . (to t) . (from nat)

Note that in rantest1, visit_as starts with a random natural number from
which it generates its test data of a given type. After testing the encoder, the
result is brought back as a natural number that should be the same as the
original random number.
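
The round-trip idea can be sketched without the rest of the framework. The version below is a simplification under stated assumptions: encoders map directly to Integer rather than through the paper's sequence type, the random source is a hand-written linear congruential step (pseudoRandom, with arbitrarily chosen constants) instead of randomR, and the two sample encoders zigzag and broken are ours:

```haskell
-- A bijection packaged as a pair of mutually inverse functions.
data Iso a b = Iso (a -> b) (b -> a)

from :: Iso a b -> a -> b
from (Iso f _) = f

to :: Iso a b -> b -> a
to (Iso _ g) = g

-- Simplifying assumption: the hub type is Integer itself.
type Encoder a = Iso a Integer

-- Signed integers as naturals: 0,-1,1,-2,2,... <-> 0,1,2,3,4,...
zigzag :: Encoder Integer
zigzag = Iso toNat fromNat
  where
    toNat z
      | z >= 0    = 2 * z
      | otherwise = 2 * negate z - 1
    fromNat n
      | even n    = n `div` 2
      | otherwise = negate ((n + 1) `div` 2)

-- Deliberately wrong encoder (forgets the sign), to show a failing test.
broken :: Encoder Integer
broken = Iso (\z -> 2 * abs z) (to zigzag)

-- Hypothetical stand-in for rannat: one linear congruential step.
pseudoRandom :: Integer -> Integer
pseudoRandom seed =
  (6364136223846793005 * seed + 1442695040888963407) `mod` (2 ^ 50)

-- Number -> value of type t -> number must give back the same number.
roundTrips :: Encoder t -> Integer -> Bool
roundTrips t n = from t (to t n) == n

ranTest :: Encoder t -> Bool
ranTest t = all (roundTrips t . pseudoRandom) [0 .. 255]
```

Here ranTest zigzag succeeds while ranTest broken fails, since any odd test number decodes to a negative value whose sign the broken encoder discards.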

We can now implement our tester isotest that in a few seconds goes over
thousands of test cases and aggregates the result with a final and:

isotest = and (map rt [0..25])

rt 0 = rantest nat
rt 1 = rantest fun
rt 2 = rantest set
rt 3 = rantest bits
rt 4 = rantest funbits
rt 5 = rantest hfs
rt 6 = rantest hff
rt 7 = rantest uhfs
rt 8 = rantest uhff
rt 9 = rantest perm
rt 10 = rantest hfp
rt 11 = rantest nat2
rt 12 = rantest set2
rt 13 = rantest clist
rt 14 = rantest pbdd
rt 15 = rantest bdd
rt 16 = rantest rbdd
rt 17 = rantest digraph
rt 18 = rantest graph
rt 19 = rantest mdigraph
rt 20 = rantest mgraph
rt 21 = rantest hypergraph
rt 22 = rantest dyadic
rt 23 = rantest string
rt 24 = rantest pars
rt 25 = rantest dna

The empirical correctness test of the "whole enchilada" follows: 



*ISO> isotest 
True 

suggesting that the probability of having errors in the code described so far is
extremely small.

22 Applications 

Besides their utility as a uniform basis for a general purpose data conversion 
library, let us point out some specific applications of our isomorphisms. 

22.1 Combinatorial Generation 

A free combinatorial generation algorithm (providing a constructive proof of 
recursive enumerability) for a given structure is obtained simply through an 
isomorphism from nat: 

nth thing = as thing nat
nths thing = map (nth thing)
stream_of thing = nths thing [0..]

*ISO> nth set 42
[1,3,5]
*ISO> nth bits 42
[1,1,0,1,0]
*ISO> take 3 (stream_of hfs)
[H [],H [H []],H [H [H []]]]
*ISO> take 3 (stream_of bdd)
[BDD 0 B0,BDD 0 B1,BDD 1 (D 0 B0 B0)]
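
As an illustration of how such a stream arises, the following stand-alone sketch enumerates hereditarily finite sets through an Ackermann-style encoding. It is only a sketch: the helper names (nat2set, set2nat, nat2hfs, hfs2nat, streamOfHfs) are ours, and the type is declared locally rather than taken from the ISO module:

```haskell
-- Hereditarily finite sets: a set whose elements are again such sets.
newtype HFS = H [HFS] deriving (Eq, Show)

-- Positions of the 1 bits of n, lowest first.
nat2set :: Integer -> [Integer]
nat2set = go 0
  where
    go _ 0 = []
    go i k
      | odd k     = i : go (i + 1) (k `div` 2)
      | otherwise = go (i + 1) (k `div` 2)

-- Inverse of nat2set on lists of distinct naturals.
set2nat :: [Integer] -> Integer
set2nat = sum . map (2 ^)

-- Unranking: read n as a set of numbers, then unrank each element.
nat2hfs :: Integer -> HFS
nat2hfs = H . map nat2hfs . nat2set

-- Ranking: the inverse direction.
hfs2nat :: HFS -> Integer
hfs2nat (H xs) = set2nat (map hfs2nat xs)

-- Every hereditarily finite set appears exactly once in this stream.
streamOfHfs :: [HFS]
streamOfHfs = map nat2hfs [0 ..]
```

The recursion terminates because every bit position of n is strictly smaller than n, and take 3 streamOfHfs reproduces the three sets shown in the session above.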



22.2 Random Generation 

Combining nth with a random generator for nat provides free algorithms for 
random generation of complex objects of customizable size: 

ran thing seed largest = head (random_gen thing seed largest 1)

random_gen thing seed largest n = genericTake n
  (nths thing (rans seed largest))

rans seed largest =
  randomRs (0,largest) (mkStdGen seed)

For instance

*ISO> random_gen set 11 999 3
[[0,2,5],[0,5,9],[0,1,5,6]]

generates a list of 3 random sets. Similarly,

*ISO> ran digraph 5 (2^31)
[(1,0),(0,1),(2,1),(1,3),(2,2),(3,2),(4,0),(4,1),
 (5,1),(6,0),(6,1),(7,1),(5,3),(6,2),(6,3)]
*ISO> ran hfs 7 30
H [H [],H [H [],H [H []]],H [H [H [H []]]]]
*ISO> ran dnaStrand 1 123456789
DNAstrand P5x3 [Guanine,Thymine,Guanine,Cytosine,
  Cytosine,Thymine,Thymine,Thymine,Thymine,
  Adenine,Thymine,Cytosine,Cytosine]

generate a random digraph, a hereditarily finite set and a DNA strand.

Random generators for various data types are useful for further automating
test generators in tools like QuickCheck [J by generating customized random 
tests. 

An interesting other application is generating random problems or programs 
of a given type and size. For instance 

*ISO> ran sat 8 (2^31)
[[-1],[1,-1],[-1,2],[1,-1,2],[-2],[1,-2],[-1,-2],[1,-1,-2],
 [2,-2],[1,2,-2],[-1,2,-2],[3],[1,-1,3],[1,-1,2,3],[1,-2,3],
 [-1,-2,3],[2,-2,3],[1,2,-2,3],[-1,2,-2,3]]
*ISO> ran clist 8 12345
Cons (Atom 0) (Cons (Cons (Atom 0) (Atom 0)) (Atom 100))

generate, respectively, a random SAT problem and a random Cons-list.

22.3 Succinct Representations 

Depending on the information theoretical density of various data representations 
as well as on the constant factors involved in various data structures, significant 
data compression can be achieved by choosing an alternate isomorphic represen- 
tation, as shown in the following examples: 

*ISO> as hff hfs (H [H [H []],H [H [],
        H [H []]],H [H [],H [H [H []]]]])
H [H [H []],H [H []],H [H []]]
*ISO> as nat hff (H [H [H []],H [H []],H [H []]])
42
*ISO> as fun bits [0,1,0,0,0,0,0,0,0,0,0]
[0,10]
*ISO> as rbdd hfs (H [H [],H [H [],H [H []]],
        H [H [H []],H [H [H []]]]])
BDD 3 (D 1 B1 B0)
*ISO> as hff bdd (BDD 3 (D 2
        (D 1 (D 0 B1 B0) (D 0 B0 B1))
        (D 1 (D 0 B1 B1) (D 0 B1 B1))))
H [H [],H [H [],H [],H []]]



In particular, mapping to efficient arbitrary length integer implementations 
(usually C-based libraries), can provide more compact representations or im- 
proved performance for isomorphic higher level data representations. Alterna- 
tively, lazy representations as provided by functional binary numbers or BDDs, 
for very large integers encapsulating results of some computations might turn 
out to be more effective space-wise or time-wise. 

We can compare representations sharing a common datatype to conjecture 
about their asymptotic information density. 

22.4 Experimental Mathematics 

Comparing compactness of representations For instance, after defining: 

length_as t = fit genericLength (with nat t)
sum_as t = fit sum (with nat t)
size_as t = fit tsize (with nat t)

one can conjecture that finite functions are more compact than permutations
which are more compact than sets asymptotically:

*ISO> length_as set 123456789012345678901234567890
54
*ISO> length_as perm 123456789012345678901234567890
28
*ISO> length_as fun 123456789012345678901234567890
54
*ISO> sum_as set 123456789012345678901234567890
2690
*ISO> sum_as perm 123456789012345678901234567890
378
*ISO> sum_as fun 123456789012345678901234567890
43

One might observe that the same trend applies also to their hereditarily finite
derivatives:

*ISO> size_as hfs 123456789012345678901234567890
627
*ISO> size_as hfp 123456789012345678901234567890
276
*ISO> size_as hff 123456789012345678901234567890
91

While confirming or refuting this conjecture is beyond the scope of this paper,
the affirmative case would imply, interestingly, that "order" (permutations) has
asymptotically higher information density than "content" (sets), and explain
why finite functions (that involve both) dominate data representations in various
computing fields.



Based on the same experiment, reduced BDDs (especially if one implements 
sharing, as computed by robdd_size) also provide relatively compact represen- 
tations: 

*ISO> bdd_size $ as bdd nat 123456789012345678901234567890
256
*ISO> bdd_size $ as rbdd nat 123456789012345678901234567890
144
*ISO> robdd_size $ as rbdd nat 123456789012345678901234567890
39



Figures 21, 22 and 23 compare the sizes of bitstring, BDD, HFF, HFS, HFP
representations, first with the most succinct ones (bitstring, BDDs, HFF) grouped
together in Fig. 21, then the less succinct ones (HFS and HFP) in Fig. 22, and
finally all representations together for n in a larger interval in Fig. 23.

Fig. 21: Comparison of curve1=Bit, curve2=BDD and curve3=HFF sizes



It is also interesting to observe the ability of some representations to express
huge numbers that normally overflow computer memory but which are genuinely
"low complexity" as a result of a small number of simple computational steps
that generate them.

Fig. 22: Comparison of curve1=HFS and curve2=HFP sizes

Fig. 23: Comparison of all representation sizes at a larger scale



For instance, 

*ISO> map (as nat pars)
        ["()","(())","((()))","(((())))","((((()))))","(((((())))))"]
[0,1,2,4,16,65536]
*ISO> as hff pars "((()))"
H [H [H []]]

shows that parenthesis sequences (structurally isomorphic to hereditarily finite
functions) can represent succinctly the fast growing but low complexity series
$a_n = 2^{a_{n-1}}$. Clearly, terms of the series would exhaust computer memory quite
quickly using a conventional bitvector based arbitrary size integer representation!
This suggests the usefulness of a universal, possibly lazy, "shapeshifting"
algorithm that can decide on the most efficient data representation automatically,
using size estimates, at the time when data is actually constructed.

Sparseness criteria As a first step, one can introduce a "sparseness criterion"
by comparing the size of a representation f with the size of the self-delimiting
Elias omega code.

One can obtain an encoding of a sequence by encoding its length and
then encoding each term, parametrized by a function f : Nat → [Nat]:

nat2self f n = (to_elias l) ++ concatMap to_elias ns where
  ns = f n
  l=genericLength ns

nat2sfun n = nat2self (as fun nat) n

This function is injective (but not onto!) and its action can be reversed by first
decoding the length l and then extracting self-delimited sequences l times.

self2nat g ts = (g xs,ts') where
  (l,ns) = from_elias ts
  (xs,ts')=take_from_elias l ns

take_from_elias 0 ns = ([],ns)
take_from_elias k ns = ((x:xs),ns'') where
  (x,ns')=from_elias ns
  (xs,ns'')=take_from_elias (k-1) ns'

sfun2nat ns = xs where
  (xs,[])=self2nat (as nat fun) ns

We obtain the Encoder: 

sfun :: Encoder [Nat]
sfun = compose (Iso sfun2nat nat2sfun) nat

working as follows:

*ISO> as sfun nat 42
[1,0,1,0,0,0,1,0,0,1,0,0,1,0,0]



*ISO> as nat sfun it 
42 

A simple concept of sparseness is derived by comparing the size of a
self-delimiting code for a number n vs. the size of its self-delimiting representation
as a finite sequence, finite set or finite permutation as shown in Fig. 24, computed
as follows:

linear_sparseness_pair t n =
  (genericLength (to_elias n),genericLength (nat2self (as t nat) n))

linear_sparseness f n = x/y where (x,y)=linear_sparseness_pair f n

Fig. 24: Sparseness measures with curve1=fun, curve2=set, curve3=perm



We can also extend this comparison to the hereditarily finite representations,
which, as a pleasant surprise, turn out to provide self-delimiting codes.

sparseness_pair f n =
  (genericLength (to_elias n),genericLength (as f nat n))

sparseness f n = x/y where (x,y)=sparseness_pair f n

One can then compare (self-delimiting) parenthesis language representations
for hereditarily finite encoders provided by HFF, HFS, HFP and discover the
"peaks" of sparseness as shown in Figs. 25 and 26.




A new self-delimiting code While the HFF representation is generally less
compact than Elias omega code, its simplicity suggests it as a possibly useful
self-delimiting code, especially interesting for streams of "sparse" values, as shown
in Fig. 27.

One can collect values that have smaller HFF codes than Elias omega codes 
i.e. "sparse numbers" with: 

sparses_to m = [n|n<-[0..m-1],
  (genericLength (as hff_pars nat n))
  <
  (genericLength (as elias nat n))]

working as follows

*ISO> sparses_to (2^11)
[15,16,17,24,32,64,65,96,128,129,192,256,257,258,259,320,384,
385,448,512,513,514,515,516,517,518,519,520,544,576,640,641,704,768,
769,770,771,832,896,897,960,1024,1025,1026,1027,1028,1029,1030,1031,
1032,1088,1152,1280,1281,1408,1536,1537,1538,1539,1664,1792,1793,1920]

and notice that the list collects an unusually large number of various popular
memory chip and computer screen sizes. Figure 28 shows the distribution of "sparse
numbers".

Fig. 26: Sparseness measures with curve1=HFF, curve2=HFS, curve3=HFP

[Figure: Self-delimiting codes: Undelimited vs. Elias vs. HFF]

Primes and Pairing Functions Products of two prime numbers have the
interesting property that they are a special case where no information is lost
by multiplication in the sense of [35]. Indeed, in this case multiplication is
reversible, i.e. the two factors can be recovered given the product. As the product
is comparatively easy to compute, while in case of large primes factoring is
believed intractable, this property has well-known uses in cryptography. Given the
isomorphism between natural numbers and primes mapping a prime to its
position in the sequence of primes, one can transport pairing/unpairing operations
to prime numbers:

ppair pairingf (p1,p2) | is_prime p1 && is_prime p2 =
  from_pos_in ps (pairingf (to_pos_in ps p1,to_pos_in ps p2)) where
    ps = primes

punpair unpairingf p | is_prime p = (from_pos_in ps n1,from_pos_in ps n2) where
  ps=primes
  (n1,n2)=unpairingf (to_pos_in ps p)

working as follows:

*ISO> ppair bitpair (11,17) 
269 

*ISO> punpair bitunpair it 
(11,17) 

Clearly, this defines a bijection f : Primes × Primes → Primes that is tempting
to compare with the product of two primes. Figs. 29 and 30 show the surfaces
generated by products and multiset pairings of primes. While both commutative
operations are reversible and likely to be asymptotically equivalent in terms of
information density, one can notice the much smoother transition in the case of
lossless multiplication.

Fig. 29: Lossless multiplication of primes

We have seen that recursive application of the unpairing function bitunpair
provided an isomorphism between natural numbers and BDDs. Given an
unpairing function u : Nat → Nat × Nat and a predicate p(n) over the set of
natural numbers, it makes sense to investigate subsets of Nat such that if p
holds for n then it also holds after applying the unpairing function u to n. More
interestingly, one can look at subsets for which this property holds recursively.

Fig. 30: Lossless multiset pairing of primes

Assuming a prime recognizer is_prime and a generator primes for the stream 
of prime numbers (see Appendix), we can define: 

hyper_primes u = [n|n<-primes, all_are_primes (uparts u n)] where
  all_are_primes ns = and (map is_prime ns)

uparts u = sort . nub . tail . (split_with u) where
  split_with _ 0 = []
  split_with _ 1 = []
  split_with u n = n : (split_with u n0)++(split_with u n1) where
    (n0,n1)=u n

working as follows: 

*ISO> take 20 (hyper_primes bitunpair) 

[2,3,5,7,11,13,17,19,23,29,31,43,47,59,71,79,83,89,103,139] 
*ISO> take 20 (hyper_primes pepis_unpair) 

[2,3,5,7,11,13,19,23,29,31,43,53,59,107,127,173,223,251,311,347] 
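
The construction can be replayed outside the ISO module with the sketch below. It rests on two assumptions of ours: bitunpair is taken to send the bits at even positions to the first component and the bits at odd positions to the second, and primality is checked by plain trial division rather than by the Appendix code. All identifiers are local to the sketch:

```haskell
import Data.List (nub, sort)

-- Trial-division primality test (a simple stand-in for is_prime).
isPrime :: Integer -> Bool
isPrime n =
  n > 1 && all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) [2 ..])

primes :: [Integer]
primes = filter isPrime [2 ..]

-- Assumed reading of bitunpair: even bit positions go to the first
-- component, odd bit positions to the second.
bitunpair :: Integer -> (Integer, Integer)
bitunpair 0 = (0, 0)
bitunpair n = (n `mod` 2 + 2 * y, x)
  where (x, y) = bitunpair (n `div` 2)

-- Everything reachable from n by repeated unpairing, excluding n itself
-- and the terminal values 0 and 1.
uparts :: (Integer -> (Integer, Integer)) -> Integer -> [Integer]
uparts u = sort . nub . drop 1 . splitWith
  where
    splitWith n
      | n < 2     = []
      | otherwise = n : splitWith n0 ++ splitWith n1
      where (n0, n1) = u n

-- Primes all of whose recursive unpairing parts are prime as well.
hyperPrimes :: (Integer -> (Integer, Integer)) -> [Integer]
hyperPrimes u = [p | p <- primes, all isPrime (uparts u p)]
```

For example, 43 unpairs to (1,7), 7 to (3,1) and 3 to (1,1), so 43 qualifies, whereas 37 unpairs to (3,4) and is rejected. The sketch only terminates for unpairing functions whose components are smaller than their argument for n > 1, which is the first condition of Conjecture 2.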

This leads to the following conjectures, in increasing order of generality: 

Conjecture 1 The sets generated by (hyper_primes bitunpair) and (hyper_primes
pepis_unpair) are infinite.

Conjecture 2 If u is a bijection u : Nat → Nat × Nat such that:

1. if n > 1 and u n = (n0, n1) then n0 < n and n1 < n
2. p is a predicate on Nat such that P = {n : p(n)} is infinite

then the set P ∩ {n : uparts u n ⊂ P} is also infinite.



Figure 31 shows the complete unpairing graph for two hyper-primes obtained 
with bitunpair. 




Fig. 31: mset_unpair hyper-primes: 1783 and 2109167 



It is interesting to compare the action of pairing of natural numbers with their 
action on functions on primes and hyper-primes with products. Clearly products 
are not reversible, except when numbers are primes, while pairing functions are 
always reversible. To factor in the fact that products commute while pairing 
functions do not, we have considered 2xy instead of xy. 

Figures 32 and 33 show this comparison.

Fig. 32: Pairing of primes vs. 2xy

Fig. 33: Pairing of hyper-primes vs. 2xy



Hyper-primes and Fermat primes One could expect to model more closely 
the behavior of primes and products by focusing on commutative functions like 
the multiset pairing function mset_pair: 

*ISO> take 16 (hyper_primes mset_unpair)
[2,3,5,13,17,113,173,257,10753,17489,34897,34961,43633,43777,65537,142781101]

We recall that:

Definition 3 A Fermat prime is a prime of the form $2^{2^n} + 1$ with $n \geq 0$.

Fig. 34 shows a hyper-prime that is also a Fermat prime and a hyper-prime that
is not a Fermat prime.




Fig. 34: mset_unpair hyper-primes: Fermat prime and Non-Fermat prime 



This time a more interesting conjecture emerges. We can now state that: 

Conjecture 3 All Fermat primes are mset_unpair induced hyper-primes. 

We will just observe that this would follow from the widely believed conjecture
that the only Fermat primes are [3,5,17,257,65537], as these 5 primes are
indeed on our list of mset_unpair hyper-primes.

In the event of the alternative, we will now state: 



Proposition 12 If there are Fermat primes other than [3,5,17,257,65537] then 
there are Fermat primes that are not mset_unpair hyper-primes. 

To prove Prop. 12 we need a few additional results. First, the following known
fact, implying that we only need to prove that there are primes of the form
$2^{2^n} + 1$ that are not hyper-primes.

Lemma 1 If $n > 0$ and $2^n + 1$ is prime then $n$ is a power of 2.

It is easy to prove, from the definition of mset_pair that:

Lemma 2
$$\mathrm{mset\_pair}(2^{2^n} + 1,\; 2^{2^n} + 1) = 2^{2^{n+1}} + 1 \qquad (23)$$

Indeed, from the identity [19] we obtain
$$\mathrm{mset\_pair}(a, a) = \mathrm{bitpair}(a, 0) \qquad (24)$$
and then observe that from [14] it follows that
$$\mathrm{bitpair}(2^{2^n} + 1,\; 0) = 2^{2^{n+1}} + 1 \qquad (25)$$

We can now prove Prop. 12. If $2^{2^{n+1}} + 1$ is a Fermat prime that is also a
hyper-prime, then $2^{2^n} + 1$ would also be a Fermat prime that is a hyper-prime. This would
form a descending sequence of consecutive Fermat primes - a contradiction,
given that it has been proven (by Leonhard Euler in 1732) that, for instance,
$2^{32} + 1 = 4294967297$ is not prime.
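
Lemma 2 is easy to check numerically for small n. The sketch below assumes that bitpair interleaves bits with its first argument on the even positions, and that mset_pair (a,b) with a ≤ b is bitpair (a, b-a), which agrees with identity (24); the names bitpair, msetPair, fermat and lemma2Holds are local to the sketch:

```haskell
-- Assumed definition: bits of x land on even positions, bits of y on odd ones.
bitpair :: (Integer, Integer) -> Integer
bitpair (0, 0) = 0
bitpair (x, y) = x `mod` 2 + 2 * bitpair (y, x `div` 2)

-- Assumed definition: pair the smaller element with the difference.
msetPair :: (Integer, Integer) -> Integer
msetPair (a, b)
  | a <= b    = bitpair (a, b - a)
  | otherwise = bitpair (b, a - b)

-- The n-th Fermat number 2^(2^n) + 1.
fermat :: Integer -> Integer
fermat n = 2 ^ (2 ^ n) + 1

-- Lemma 2 for one value of n: pairing F_n with itself yields F_(n+1).
lemma2Holds :: Integer -> Bool
lemma2Holds n = msetPair (fermat n, fermat n) == fermat (n + 1)
```

Evaluating all lemma2Holds [0..6] checks the first seven instances, and fermat 5 == 641 * 6700417 confirms Euler's factorization used at the end of the proof.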



22.5 A surprising "free algorithm": strange_sort 

A simple isomorphism like nat_set can exhibit interesting properties as a building
block of more intricate mappings like Ackermann's encoding, but let's also
note a (surprising to us) "free algorithm" - sorting a list of distinct elements 
without explicit use of comparison operations: 

strange_sort = (from nat_set) . (to nat_set) 

*ISO> strange_sort [2,9,3,1,5,0,7,4,8,6] 
[0,1,2,3,4,5,6,7,8,9] 

This algorithm emerges as a consequence of the commutativity of addition and 
the unicity of the decomposition of a natural number as a sum of powers of 2. 
The cognoscenti might notice that such surprises are not totally unexpected in 
the world of functional programming. In a different context, they go back as 
early as Wadler's Free Theorems [35]. In a similar way, to sort sequences with 
repeated elements one can write 

strange_sort' = (to mset) . (from mset)
strange_sort'' = (as mset nat) . (as nat mset)

*ISO> strange_sort' [2,4,1,1,0,3,17,1,4]
[0,1,1,1,2,3,4,4,17]

*ISO> strange_sort'' [2,4,1,1,0,3,17,1,4]
[0,1,1,1,2,3,4,4,17]
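
For the set-based variant, the effect can be reproduced in a few lines. This is a minimal sketch with our own helper names (set2nat, nat2set, strangeSort), covering only lists of distinct non-negative integers:

```haskell
-- A list of distinct naturals becomes the number whose 1 bits sit at
-- those positions; the order of the list is lost in the sum.
set2nat :: [Integer] -> Integer
set2nat = sum . map (2 ^)

-- Read the positions of the 1 bits back, lowest first.
nat2set :: Integer -> [Integer]
nat2set = go 0
  where
    go _ 0 = []
    go i k
      | odd k     = i : go (i + 1) (k `div` 2)
      | otherwise = go (i + 1) (k `div` 2)

-- Sorting falls out of the round trip; no element comparison is written.
strangeSort :: [Integer] -> [Integer]
strangeSort = nat2set . set2nat
```

The intermediate number has as many bits as the largest element plus one, so this is a curiosity rather than a practical sort. Repeated elements would produce carries in the sum, which is why the multiset variants above are needed for them.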



22.6 Circuit Minimization 



Let us consider the classic problem of synthesizing a half adder, composed of an
XOR (^) and an AND (*) function. We can combine the two functions with an
if-then-else with selector variable A to obtain: ITE(A,B^C,B*C) with the following
truth table:

[0,0,0]:0
[0,0,1]:0
[0,1,0]:0
[0,1,1]:1
[1,0,0]:0
[1,0,1]:1
[1,1,0]:1
[1,1,1]:0

Note that this 3 argument single output function (encoded as the natural number
22 by reading its value column in binary), fuses the two operations with the upper
half of the truth table representing the AND and the lower half representing the
XOR. When running to_min_bdd on this function we obtain:

*ISO> from_base 2 [0,1,1,0,1,0,0,0]
22
*ISO> to_min_bdd 3 22
BDD 3 (D 0
  (D 1 (D 2 B0 B1) (D 2 B1 B0))
  (D 1 (D 2 B1 B0) B0))
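
The number 22 can be derived mechanically from the truth table. The sketch below uses our own names (ite, valueColumn, tableNumber) and reads the value column with the first row as the most significant bit, which is the same number as the least-significant-first list passed to from_base above:

```haskell
import Data.Bits (xor, (.&.))

-- ITE(A, B xor C, B and C) on 0/1 values.
ite :: Int -> Int -> Int -> Int
ite a b c = if a == 1 then b `xor` c else b .&. c

-- Value column of the truth table, rows ordered [0,0,0] .. [1,1,1].
valueColumn :: [Int]
valueColumn = [ite a b c | a <- [0, 1], b <- [0, 1], c <- [0, 1]]

-- Read the column as a binary numeral, first row most significant.
tableNumber :: Integer
tableNumber = foldl (\acc v -> 2 * acc + fromIntegral v) 0 valueColumn
```

Here valueColumn is [0,0,0,1,0,1,1,0] and tableNumber is 22.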



22.7 Other Applications 

A fairly large number of useful algorithms in fields ranging from data compres- 
sion, coding theory and cryptography to compilers, circuit design and computa- 
tional complexity involve bijective functions between heterogeneous data types. 
Their systematic encapsulation in a generic API that coexists well with strong 
typing can bring significant simplifications to various software modules with the 
added benefits of reliability and easier maintenance. In a Genetic Programming
context [37] the use of isomorphisms between bitvectors/natural numbers on
one side, and trees/graphs representing HFSs, HFFs on the other side, looks
like a promising phenotype-genotype connection. Mutations and crossovers in
a data type close to the problem domain are transparently mapped to
numerical domains where evaluation functions can be computed easily. In particular,
"biologically proven" encodings like DNA strands are likely to provide
interesting genotype implementations. In the context of Software Transactional Memory
implementations (like Haskell's STM), encodings through isomorphisms are
subject to efficient shortcuts, as undo operations in case of transaction failure
can be performed by applying inverse transformations without the need to save
the intermediate chain of data structures involved.



23 Related work 



The closest reference on encapsulating bijections as a Haskell data type is [39]
and Conal Elliott's composable bijections module [40], where, in a more complex
setting, Arrows [41] are used as the underlying abstractions. While our Iso
data type is similar to the Bij data type in [40] and the BiArrow concept of [39],
the techniques for using such isomorphisms as building blocks of an embedded
composition language centered around encodings as Natural Numbers are new.

As the domains between which we define our isomorphisms can be organized
as categories, it is likely that some of our constructs would benefit from natural
transformation [42] and n-category formulations [43].

Ranking functions can be traced back to Godel numberings (5|F] associated to 
formulae. Together with their inverse unranking functions they are also used in 
combinatorial generation algorithms [44,27,45,46]. However the generic view of
such transformations as hylomorphisms obtained compositionally from simpler 
isomorphisms, as described in this paper, is new. 

Natural Number encodings of Hereditarily Finite Sets have triggered the
interest of researchers in fields ranging from Axiomatic Set Theory and
Foundations of Logic to Complexity Theory and Combinatorics [47,48,49,50,51,52].
Computational and Data Representation aspects of Finite Set Theory have been
described in logic programming and theorem proving contexts in [12,53].

Pairing functions have been used in work on decision problems as early as
[20,23]. A typical use in the foundations of mathematics is [53]. An extensive
study of various pairing functions and their computational properties is
presented in [55].

Various mappings from natural numbers to rational numbers are described
in [56], also in a functional programming framework.

We have learned from Knuth's recent work on combinatorial algorithms [27]
the techniques related to bitvector encodings of projection functions and boolean
operations and about BDDs and reduced ordered BDDs from Bryant's seminal
paper on the topic [26]. However, the connection with pairing/unpairing
functions and the equivalence results of subsection 10.4 are new.

The concepts of hereditarily finite functions and permutations as well as their
encodings, are likely to be new, given that our sustained search efforts have not
led so far to anything similar.

Some other techniques, ranging from factoradics to cons-lists and functional
binary numbers to DNA encodings and dyadic rationals are for sure part of
the scientific commons. In that case our focus was to express them as elegantly
as possible in a uniform framework. In these cases as well, most of the time it
was faster to "just do it", by implementing them from scratch in a functional
programming framework, rather than adapting procedural algorithms found
elsewhere.



24 Conclusion 



We have shown the expressiveness of Haskell as a metalanguage for executable
mathematics, by describing encodings for functions and finite sets in a uniform 
framework as data type isomorphisms with a groupoid structure. Haskell's higher 
order functions and recursion patterns have helped the design of an embed- 
ded data transformation language. Using higher order combinators a simplified 
QuickCheck style random testing mechanism has been implemented as an em- 
pirical correctness test. The framework has been extended with hylomorphisms 
providing generic mechanisms for encoding Hereditarily Finite Sets and Hered- 
itarily Finite Functions. In the process, a few surprising "free algorithms" have 
emerged as well as a generalization of Ackermann's encoding to Hereditarily Fi- 
nite Sets with Urelements. We plan to explore in depth in the near future, some 
of the results that are likely to be of interest in fields ranging from combina- 
torics and boolean logic to data compression and arbitrary precision numerical 
computations.

Appendix

The code in the paper is organized in a module with the following dependencies:

module ISO where
import Data.List
import Data.Bits
import Data.Graph
import Data.Graph.Inductive
import Graphics.Gnuplot.Simple
import Data.Char
import Ratio
import Random



Bit crunching functions 

The function bitcount computes the number of bits needed to represent an in- 
teger and max.bitcount computes the maximum bitcount for a list of integers. 

bitcoiint n = head [x|x-(— [1. .] , (2"x)>n] 
max_bitcount ns = foldl max (map bitcount ns) 

The following function converts a number to binary, padded with 0s, up to
maxbits.

to_maxbits maxbits n =
  bs ++ (genericTake (maxbits-l)) (repeat 0) where
    bs=to_base 2 n
    l=genericLength bs
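The bit helpers above can be tried in isolation with the following minimal sketch. Since to_base is defined elsewhere in the paper, the version used here is an assumed stand-in that returns digits least-significant first, so that appending 0s does not change the encoded value.

```haskell
import Data.List (genericLength, genericTake)

-- Number of bits needed for n: the smallest x >= 1 with 2^x > n.
bitcount :: Integer -> Integer
bitcount n = head [x | x <- [1..], 2^x > n]

-- Largest bitcount over a list (0 for the empty list).
max_bitcount :: [Integer] -> Integer
max_bitcount ns = foldl max 0 (map bitcount ns)

-- ASSUMPTION: stand-in for the paper's to_base; digits come out
-- least-significant first, and 0 is encoded as [0].
to_base :: Integer -> Integer -> [Integer]
to_base base n
  | n < base  = [n]
  | otherwise = (n `mod` base) : to_base base (n `div` base)

-- Pad the binary digits of n with trailing 0s up to maxbits digits.
to_maxbits :: Integer -> Integer -> [Integer]
to_maxbits maxbits n = bs ++ genericTake (maxbits - l) (repeat 0)
  where
    bs = to_base 2 n
    l  = genericLength bs
```

Under these assumptions, to_maxbits 8 5 yields [1,0,1,0,0,0,0,0] and max_bitcount [1,5,8] yields 4.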



Primes 

The following code implements a factoring function (to_primes), a primality
test (is_prime) and a generator for the infinite stream of prime numbers (primes).

primes = 2 : filter is_prime [3,5..]

is_prime p = [p]==to_primes p

to_primes n | n>1 = to_factors n p ps where
  (p:ps) = primes

to_factors n p ps | p*p > n = [n]
to_factors n p ps | 0==n `mod` p = p : to_factors (n `div` p) p ps
to_factors n p ps@(hd:tl) = to_factors n hd tl
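As a quick check of how the three definitions cooperate, here is a self-contained sketch of the same trial-division scheme with explicit type signatures (the signatures are an addition; the paper leaves the types to inference). Laziness is what makes the mutual recursion between primes and is_prime work: testing a candidate only forces primes up to its square root.

```haskell
-- Infinite stream of primes; odd candidates are filtered by trial factoring.
primes :: [Integer]
primes = 2 : filter is_prime [3,5..]

-- A number is prime when its factorization is the singleton list.
is_prime :: Integer -> Bool
is_prime p = [p] == to_primes p

-- Prime factorization in non-decreasing order (defined for n > 1 only).
to_primes :: Integer -> [Integer]
to_primes n | n > 1 = to_factors n p ps
  where (p:ps) = primes

-- Divide out the current prime p while possible, then move to the next one.
to_factors :: Integer -> Integer -> [Integer] -> [Integer]
to_factors n p ps
  | p*p > n        = [n]
  | n `mod` p == 0 = p : to_factors (n `div` p) p ps
to_factors n _ (hd:tl) = to_factors n hd tl
```

For instance, to_primes 2008 gives [2,2,2,251] and take 5 primes gives [2,3,5,7,11].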

We will briefly describe here the functions used to visualize various data types 
with the help of Haskell libraries providing interfaces to graphviz and gnuplot. 



Multiset Operations 



The following functions provide multiset analogues of the usual set operations, 
under the assumption that multisets are represented as non-decreasing sequences. 

msetInter [] _ = []
msetInter _ [] = []
msetInter (x:xs) (y:ys) | x==y = (x:zs) where zs=msetInter xs ys
msetInter (x:xs) (y:ys) | x<y = msetInter xs (y:ys)
msetInter (x:xs) (y:ys) | x>y = msetInter (x:xs) ys

msetDif [] _ = []
msetDif xs [] = xs
msetDif (x:xs) (y:ys) | x==y = zs where zs=msetDif xs ys
msetDif (x:xs) (y:ys) | x<y = (x:zs) where zs=msetDif xs (y:ys)
msetDif (x:xs) (y:ys) | x>y = zs where zs=msetDif (x:xs) ys

msetSymDif xs ys = sort ((msetDif xs ys) ++ (msetDif ys xs))

msetUnion xs ys = sort ((msetDif xs ys) ++ (msetInter xs ys) ++ (msetDif ys xs))
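The merge-style recursion behind these operations is easiest to follow on a small input. The sketch below restates them with guards and type signatures (both are presentational additions) and assumes, as stated above, that the inputs are non-decreasing lists.

```haskell
import Data.List (sort)

-- Multiset intersection: keep one copy for each matched pair of equal heads.
msetInter :: Ord a => [a] -> [a] -> [a]
msetInter [] _ = []
msetInter _ [] = []
msetInter (x:xs) (y:ys)
  | x == y    = x : msetInter xs ys
  | x < y     = msetInter xs (y:ys)
  | otherwise = msetInter (x:xs) ys

-- Multiset difference: cancel matched pairs, keep unmatched left elements.
msetDif :: Ord a => [a] -> [a] -> [a]
msetDif [] _ = []
msetDif xs [] = xs
msetDif (x:xs) (y:ys)
  | x == y    = msetDif xs ys
  | x < y     = x : msetDif xs (y:ys)
  | otherwise = msetDif (x:xs) ys

msetSymDif :: Ord a => [a] -> [a] -> [a]
msetSymDif xs ys = sort (msetDif xs ys ++ msetDif ys xs)

-- Union keeps the larger multiplicity of each element.
msetUnion :: Ord a => [a] -> [a] -> [a]
msetUnion xs ys = sort (msetDif xs ys ++ msetInter xs ys ++ msetDif ys xs)
```

With xs = [1,1,2,3] and ys = [1,2,2,3], the intersection is [1,2,3], the difference xs minus ys is [1], the symmetric difference is [1,2] and the union is [1,1,2,2,3].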

Building a multigraph from a natural number using a function
associating to each natural number a sequence or set of natural
numbers.

fun2g ns = nat2fgs nat2fun ns
set2g ns = nat2sgs nat2set ns
perm2g ns = nat2fgs nat2perm ns
pmset2g ns = nat2fgs nat2pmset ns
bmset2g ns = nat2fgs nat2bmset ns

nat2fg f n = nat2gx fun_edge f nat2pftree n :: Gr Nat Int
nat2fgs f ns = nat2gsx fun_edge f nat2pftree ns :: Gr Nat Int
nat2sg f n = nat2gx set_edge f nat2pftree n :: Gr Nat ()
nat2sgs f ns = nat2gsx set_edge f nat2pftree ns :: Gr Nat ()
set_edge xs (a,b,i) = (lookUp a xs,lookUp b xs,())
fun_edge xs (a,b,i) = (lookUp a xs,lookUp b xs,i)

nat2gx e f g n = mkGraph vs (map (e xs) es) where
  es=g f n
  (xs,vs)=labeledVertices es

nat2gsx e f g ns = mkGraph vs (map (e xs) es) where
  es=nub (concatMap (g f) ns)
  (xs,vs)=labeledVertices es

labeledVertices es= (xs,vs) where
  xs=fvertices es
  is=[0..(length xs)-1]
  vs = zip is xs

nat2pftree f n = nub (nat2pftreex f (n,n,0))

nat2pftreex f (_,n,_) = ps ++ (concatMap (nat2pftreex f) ps) where
  ps = nat2pfun f (n,n,0)

nat2pfun _ (_,0,_) = []
nat2pfun f (_,n,_) | n>0 = ps where
  ps = zipWith (\x i->(n,x,i)) (f n) [0..]

fvertices ps = (sort . nub) (concatMap f ps) where
  f (a,b,_) = [a,b]

lookUp n ns = i where Just i=elemIndex n ns
Building Inductive Graphs from Lists of Pairs 

pairs2gr :: [(Nat,Nat)] -> Gr Nat ()
pairs2gr ps = mkGraph lvs les where
  vs=to_vertices ps
  lvs=zip [0..] vs
  es=to_edges vs ps
  les=map f es
  f (x,y) = (x,y,())

to_vertices es = sort $ nub $ concat [[fst p,snd p]|p<-es]

to_edges vs ps = map (f vs) ps where
  f vs (x,y) = (lookUp x vs,lookUp y vs)
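To see what pairs2gr hands to mkGraph without installing the graph library, the relabelling step can be sketched with base libraries only. The helper graphParts is a hypothetical name introduced for this illustration; it returns the labelled node list and the unit-labelled edge list that would be passed on.

```haskell
import Data.List (elemIndex, nub, sort)
import Data.Maybe (fromJust)

-- Position of n in ns; fails if n is absent, like the pattern match in lookUp.
lookUp :: Eq a => a -> [a] -> Int
lookUp n ns = fromJust (elemIndex n ns)

-- Sorted, duplicate-free list of all endpoints.
to_vertices :: Ord a => [(a, a)] -> [a]
to_vertices es = sort $ nub $ concat [[fst p, snd p] | p <- es]

-- Replace each endpoint by its index in the vertex list.
to_edges :: Eq a => [a] -> [(a, a)] -> [(Int, Int)]
to_edges vs ps = map f ps
  where f (x, y) = (lookUp x vs, lookUp y vs)

-- HYPOTHETICAL helper: the two arguments pairs2gr would give to mkGraph.
graphParts :: Ord a => [(a, a)] -> ([(Int, a)], [(Int, Int, ())])
graphParts ps = (lvs, les)
  where
    vs  = to_vertices ps
    lvs = zip [0..] vs
    les = [(x, y, ()) | (x, y) <- to_edges vs ps]
```

For the pairs [(42,7),(7,3),(3,1)] this gives the nodes [(0,1),(1,3),(2,7),(3,42)] and the edges [(3,2,()),(2,1,()),(1,0,())].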

Generating labeled edge triplets by recursing over unpairing 
functions 

The following function represents a number as a set of triplets expressing branches 
of decomposition with an unpairing function f , for instance, in the case of BDDs 
with function bitunpair. 

unpairing_edges f n = nub (h f n) where
  h _ n | n<2 = []
  h f n = ys where
    (n0,n1)=f n
    ys= (n,n0,0) : (n,n1,1) :
        (h f n0) ++
        (h f n1)



The function works as follows: 
*ISO> unpairing_edges bitunpair 42
[(42,0,0),(42,7,1),(7,3,0),(7,1,1),(3,1,0),(3,1,1)]
*ISO> unpairing_edges pepis_unpair 42
[(42,0,0),(42,21,1),(21,1,0),(21,5,1),(5,1,0),(5,1,1)]
*ISO>
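The session above can be reproduced with a self-contained sketch. The two unpairing functions are defined in the body of the paper, not in this appendix, so the versions below are assumed stand-ins: bitunpair separates the bits at even and odd positions, and pepis_unpair inverts z = 2^x * (2*y + 1) - 1. Both choices are consistent with the outputs shown, but they should be read as illustrations rather than as the paper's definitions.

```haskell
import Data.List (nub)

-- ASSUMPTION: stand-in for the paper's bitunpair. Bits at even positions
-- go to the first component, bits at odd positions to the second.
bitunpair :: Integer -> (Integer, Integer)
bitunpair 0 = (0, 0)
bitunpair n = (b + 2*y, x)
  where
    (q, b) = n `divMod` 2
    (x, y) = bitunpair q

-- ASSUMPTION: stand-in for the paper's pepis_unpair, inverting
-- z = 2^x * (2*y + 1) - 1.
pepis_unpair :: Integer -> (Integer, Integer)
pepis_unpair z = go 0 (z + 1)
  where
    go x m | even m    = go (x + 1) (m `div` 2)
           | otherwise = (x, (m - 1) `div` 2)

-- Each number m >= 2 contributes one edge per component, labelled 0 and 1,
-- followed by the edges of both components; duplicates are removed at the end.
unpairing_edges :: (Integer -> (Integer, Integer)) -> Integer
                -> [(Integer, Integer, Integer)]
unpairing_edges f n = nub (h n)
  where
    h m | m < 2 = []
    h m = (m, m0, 0) : (m, m1, 1) : (h m0 ++ h m1)
      where (m0, m1) = f m
```

The recursion terminates because both stand-ins return components strictly smaller than their argument whenever the argument is at least 2.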



Generating labeled edge triplets by recursing over untupling 
functions 

The following function represents a number as a set of triplets expressing branches 
of decomposition with an untupling function f k, for instance to_tuple k. 

untupling_edges f k n = nub (h f k n) where
  h _ _ n | n<2 = []
  h f k n = ys where
    ns = f k n
    ys = (zip3 (repeat n) ns [0..]) ++
         (concatMap (h f k) ns)

The function works as follows: 

*ISO> untupling_edges to_tuple 3 2008
[(2008,14,0),(2008,14,1),(2008,4,2),(14,2,0),(14,1,1),(14,1,2),
(2,0,0),(2,1,1),(2,0,2),(4,0,0),(4,0,1),(4,1,2)]
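A self-contained sketch of the same computation follows. The function to_tuple is defined in the body of the paper; the stand-in below is an assumption that distributes the bits of n over k components by bit position modulo k, which is consistent with the output shown for 2008.

```haskell
import Data.List (nub)

-- ASSUMPTION: stand-in for the paper's to_tuple. Component j collects the
-- bits of n found at positions congruent to j modulo k.
to_tuple :: Int -> Integer -> [Integer]
to_tuple k n =
  [ fromBits [b | (i, b) <- zip [0..] bits, i `mod` k == j] | j <- [0 .. k-1] ]
  where
    bits = toBits n
    toBits 0 = []
    toBits m = (m `mod` 2) : toBits (m `div` 2)   -- least-significant bit first
    fromBits = foldr (\b acc -> b + 2*acc) 0

-- Each number m >= 2 contributes one labelled edge per tuple component,
-- followed by the edges of the components themselves.
untupling_edges :: (Int -> Integer -> [Integer]) -> Int -> Integer
                -> [(Integer, Integer, Integer)]
untupling_edges f k n = nub (h n)
  where
    h m | m < 2 = []
    h m = zip3 (repeat m) ns [0..] ++ concatMap h ns
      where ns = f k m
```

With this stand-in the sketch is only meant for k >= 2, since for k = 1 the single component equals the argument and the recursion would not terminate.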



Building Inductive Graphs from Unpairing and Untupling Trees 

We can now turn a BDD as well as any other unpairing function generated tree 
into an inductive graph, as follows: 

to_unpair_graph f n = nat2fun_graph (unpairing_edges f) n

to_untuple_graph f k n = nat2fun_graph (untupling_edges f k) n

nat2fun_graph f n = mkGraph vs fs :: Gr Nat Int where
  es=f n
  (xs,vs)=labeledVertices es
  fs=map (fun_edge xs) es

The functions work as follows: 

*ISO> to_unpair_graph bitunpair 42
0:0->[]
1:1->[]
2:3->[(1,1),(0,1)]
3:7->[(0,2),(1,1)]
4:42->[(1,3),(0,0)]

*ISO> to_unpair_graph pepis_unpair 42
0:0->[]
1:1->[]
2:5->[(1,1),(0,1)]
3:21->[(1,2),(0,1)]
4:42->[(1,3),(0,0)]

*ISO> to_untuple_graph to_tuple 3 2008
0:0->[]
1:1->[]
2:2->[(2,0),(1,1),(0,0)]
3:4->[(2,1),(1,0),(0,0)]
4:14->[(0,2),(2,1),(1,1)]
5:2008->[(2,3),(1,4),(0,4)]
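The node numbers in these printouts come from the relabelling step performed before mkGraph is called. As a rough sketch that avoids the graph library, the code below recomputes the labelled vertices and the renumbered edges for the bitunpair 42 example; edges42 is a name introduced here for the triplets listed earlier.

```haskell
import Data.List (elemIndex, nub, sort)
import Data.Maybe (fromJust)

type Edge = (Integer, Integer, Integer)   -- (source, target, branch label)

-- Sorted distinct endpoints, plus the same list paired with node indices.
labeledVertices :: [Edge] -> ([Integer], [(Int, Integer)])
labeledVertices es = (xs, zip [0..] xs)
  where xs = (sort . nub) (concatMap (\(a, b, _) -> [a, b]) es)

-- Replace both endpoints of an edge by their node indices, keep the label.
fun_edge :: [Integer] -> Edge -> (Int, Int, Integer)
fun_edge xs (a, b, i) = (lookUp a xs, lookUp b xs, i)
  where lookUp n ns = fromJust (elemIndex n ns)

-- Edge triplets of the bitunpair decomposition of 42, as printed earlier.
edges42 :: [Edge]
edges42 = [(42,0,0),(42,7,1),(7,3,0),(7,1,1),(3,1,0),(3,1,1)]
```

Here labeledVertices edges42 numbers the values 0, 1, 3, 7 and 42 as nodes 0 to 4, and mapping fun_edge over edges42 gives [(4,0,0),(4,3,1),(3,2,0),(3,1,1),(2,1,0),(2,1,1)], which matches the adjacency lists of the first printout.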



Visualization with graphviz 

gviz g = writeFile "iso.gv"
  ((graphviz g "" (0.0,0.0) (2,2) Portrait)++"\n")

funviz f n = gviz (nat2fg f n)

setviz f n = gviz (nat2sg f n)

pviz t n = gviz (pairs2gr (as t nat n))

uviz f n = gviz (to_unpair_graph f n)

tviz f k n = gviz (to_untuple_graph f k n)
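Where the graph library's graphviz export is not available, the edge triplets can also be written out directly. The following is a minimal sketch, not the mechanism used above: toDot and writeDot are hypothetical helpers that emit a plain Graphviz digraph with one labelled edge per triplet.

```haskell
-- HYPOTHETICAL helper: render labelled edge triplets as a Graphviz digraph.
toDot :: [(Integer, Integer, Integer)] -> String
toDot es = unlines (["digraph iso {"] ++ map edge es ++ ["}"])
  where
    edge (a, b, i) =
      "  " ++ show a ++ " -> " ++ show b ++ " [label=\"" ++ show i ++ "\"];"

-- HYPOTHETICAL helper: write the rendering to a file, e.g. "iso.gv".
writeDot :: FilePath -> [(Integer, Integer, Integer)] -> IO ()
writeDot path = writeFile path . toDot
```

The resulting file can then be passed to the dot command-line tool in the usual way.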
Plotting with gnuplot 

plot3d f xs ys = plotFunc3d [Title ""] [] xs ys f
cplot3d f = plot3d (curry f)

plotpairs m | m<2^8 = cplot3d bitpair ls ls where ls=[0..m-1]

plotdyadics m = plotList
  [Title "Dyadics"]
  (map (fromRational . (as dyadic nat)) [0..m-1])

sizes_to m t = map (size_as t) [0..m-1]

plot_hf m = plotLists [Title "Bit, BDD, HFF, HFS, and HFP sizes"]
  (
    [bits_to m,bsizes_to m] ++
    (map (sizes_to m) [hff,hfs,hfm,hfp])
  )

plot_best m = plotLists [Title "Bit, BDD and HFF and HFF' sizes"]
  (
    [bits_to m,bsizes_to m] ++
    (map (sizes_to m) [hff,hff'])
  )

plot_worse m = plotLists [Title "HFM, HFS and HFP sizes"]
  (
    (map (sizes_to m) [hfm,hfs,hfp])
  )

plot hf m = plotx [hf ] m 

plotx hfx m = plotLists [Title "HF tree size"]
  (
    (map (sizes_to (2^m-1)) hfx)
  )

-- plots pairs

pplot f m = plotPath [] (map (to_ints . f) [0..2^m-1])
zplot f m = plotPath [] (map (to_ints . f) [-(2^m)..2^m-1])
to_ints (i,j)=(fromIntegral i,fromIntegral j)
diplot n = plotPath [] (map to_ints (as digraph nat n))

bsize_of n = robdd_size (as rbdd nat n)
bsizes_to m = map bsize_of [0..m-1]

bits_to m = map s [0..m-1] where s n = genericLength (as bits nat n)

plot_linear_sparseness m = plotLists [Title "Linear Sparseness"]
  [(map (linear_sparseness fun) [0..m-1]),
   (map (linear_sparseness pmset) [0..m-1]),
   (map (linear_sparseness mset) [0..m-1]),
   (map (linear_sparseness set) [0..m-1]),
   (map (linear_sparseness perm) [0..m-1])]

plot_sparseness m = plotLists [Title "Recursive Sparseness"]
  [(map (sparseness hff_pars) [0..m-1]),
   (map (sparseness hfpm_pars) [0..m-1]),
   (map (sparseness hfm_pars) [0..m-1]),
   (map (sparseness hfs_pars) [0..m-1]),
   (map (sparseness hfp_pars) [0..m-1])]



plot_sparseness1 m = plotLists
  [Title "Recursive Sequence vs. Multiset Sparseness"]
  [
    (map (sparseness hff_pars) [0..m-1]),
    (map (sparseness hfpm_pars) [0..m-1])
  ]

plot_sparseness2 m = plotLists [Title "Recursive Multiset Sparseness"]
  [
    (map (sparseness bhfm_pars) [0..m-1]),
    (map (sparseness hfm_pars) [0..m-1])
  ]

plot_sparseness3 m = plotLists [Title "Recursive Multiset Sparseness"]
  [
    (map (sparseness hff_pars) [0..m-1]),
    (map (sparseness hff_pars') [0..m-1])
  ]

plot_sparseness4 m = plotLists
  [Title "Recursive Multiset vs Multiset with Primes Sparseness"] [
    (map (sparseness hfm_pars) [0..m-1]),
    (map (sparseness hfpm_pars) [0..m-1])
  ]

plot_sparseness5 m = plotLists
  [Title "Recursive Multisets vs. Sequences"] [
    (map (sparseness hff_pars) [0..m-1]),
    (map (sparseness hfm_pars) [0..m-1])
  ]

plot_selfdels m = plotLists
  [Title "Self-delimiting codes: Undelimited vs. Elias vs. HFF"]
  [(map (genericLength . (as bits nat)) [0..m-1]),
   (map (genericLength . (as elias nat)) [0..m-1]),
   (map (genericLength . (as hff_pars nat)) [0..m-1])]

plot_pairs_prods m = plotLists [Title "Pairs vs. products"]
  [ms,prods] where
    ms=[1..m]
    pairs=map bitunpair ms
    prods=map prod pairs where prod (x,y)=2*x*y

plot_lifted_pairs m =
  plotLists [Title "Lifted pairs"] [us0,us1] where
    ms=[0..m-1]
    pairs=map bitunpair ms
    us0=map fst pairs
    us1=map snd pairs



plot_lifted_pairs1 m =
  plotLists [Title "Lifted pairs and products"] [ps,s0,s1,xys] where
    ms=[0..m-1]
    pairs=map bitunpair ms
    us0=map fst pairs
    us1=map snd pairs
    ps=zipWith (*) us0 us1
    s0=map (^2) us0
    s1=map (^2) us1
    xys=map f pairs where
      f (x,y) = x*y

plot_primes_prods m = plotLists [Title "Primes vs. products"]
  [ps,prods] where
    ms=[0..m]
    ps=genericTake m primes
    pairs=map bitunpair ps
    prods=map prod pairs where prod (x,y)=2*x*y

plot_hypers_prods m = plotLists [Title "Hyper-primes vs. products"]
  [ps,prods] where
    ms=[0..m]
    ps=genericTake m (hyper_primes bitunpair)
    pairs=map bitunpair ps
    prods=map prod pairs where prod (x,y)=2*x*y

Generated Figures 

f1=gviz (nat2sg nat2set 2008)
f2=gviz (nat2fg nat2fun 2008)
f2ap=gviz (nat2fg nat2mset 2008)
f3=gviz (nat2fg nat2perm 2008)
f4=gviz (nat2fg nat2perm 2009)
f5=pviz digraph 2008

f6=plotpairs 64
f7=plotdyadics 256

f8=plot_best (2^6)
f9=plot_worse (2^10)
f10=plot_hf (2^8)

f11a=plot_linear_sparseness (2^7)
f11=plot_sparseness (2^8)
f11b=plot_sparseness1 (2^8)
f11c=plot_sparseness2 (2^10)
f12=plot_sparseness (2^14)
f13=plot_sparseness (2^17)

f14=plot_selfdels (2^7)
f15=plotList [] (sparses_to (2^18))

f16=gviz (nat2fgs nat2fun [0..7])

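-- arithmetic progressions of 24 and 25 primes;
-- 223092870 is 23#, the product of the primes up to 23
arp24 i = 468395662504823 + 205619*223092870*i

arps24 = map arp24 [0..23]

arp25 i = 6171054912832631 + 366384*223092870*i

arps25 = map arp25 [0..24]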

f17 = gviz (fun2g arps24)
f17a = gviz (fun2g arps25)

f18 = gviz (fun2g [2^65+1,2^131+3])
f18a = gviz (set2g [2^65+1,2^131+3])

f19 = gviz (fun2g [0..7])
f20 = gviz (pmset2g [0..7])
f20a = gviz (bmset2g [0..7])
f21 = gviz (set2g [0..7])
f22 = gviz (perm2g [0..7])

g1 n = uviz bitunpair n
g2 n = uviz pepis_unpair n
g2' n = uviz pepis_unpair' n
g3 n = uviz rpepis_unpair n

isofermat=uviz mset_unpair 65537
isofermat1=uviz mset_unpair 142781101
isonfermat=uviz mset_unpair 34897

isopairs = plot_pairs_prods 256
isoprimes = plot_primes_prods 256
isohypers = plot_hypers_prods 256

isounpair1=pplot bitunpair 10
isounpair2=pplot pepis_unpair 10
isounpair3=pplot mset_unpair 10

isozunpair n=zplot zunpair n



ms2pms n = as nat pmset (as mset nat n)
pms2ms n = as nat mset (as pmset nat n)

kms2pms 0 n = n
kms2pms k n = ms2pms (kms2pms (k-1) n)

kpms2ms 0 n = n
kpms2ms k n = pms2ms (kpms2ms (k-1) n)

lms k m = [x|x<-[0..2^m-1],kms2pms k x < kpms2ms k x]

xms k m = [x|x<-[0..2^m-1],kms2pms k x < x]

eqms k m = [x|x<-[0..2^m-1],kms2pms k x == x]

xpms k m = [x|x<-[0..2^m-1],kpms2ms k x < x]

eqpms k m = [x|x<-[0..2^m-1],kpms2ms k x == x]
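The definitions above share one pattern: iterate a transformation k times, then filter an initial segment of the naturals by comparing the result with the input. A generic sketch of that pattern is given below. The names iterK, fixedPoints and swapLowBits are hypothetical; in particular swapLowBits is only a small stand-in bijection used to make the example runnable, not one of the paper's encodings.

```haskell
-- Apply f k times (k >= 0); mirrors the shape of kms2pms and kpms2ms.
iterK :: Int -> (a -> a) -> a -> a
iterK 0 _ n = n
iterK k f n = f (iterK (k - 1) f n)

-- Fixed points of the k-fold iterate on [0..2^m-1], as in eqms and eqpms.
fixedPoints :: Int -> Int -> (Integer -> Integer) -> [Integer]
fixedPoints k m f = [x | x <- [0 .. 2^m - 1], iterK k f x == x]

-- HYPOTHETICAL stand-in step: swap the two lowest bits of n.
swapLowBits :: Integer -> Integer
swapLowBits n = 4 * (n `div` 4) + 2 * (n `mod` 2) + (n `div` 2) `mod` 2
```

With this stand-in, fixedPoints 1 3 swapLowBits selects [0,3,4,7], while fixedPoints 2 3 swapLowBits selects all of [0..7] because swapping twice is the identity.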

qms k m =
  [(toRational (kpms2ms k x)) - (toRational (kms2pms k x))|x<-[1..2^m-1]]

q1 k m = plotList [] (qms k m)

q2 k m = plotLists []
  [map (kms2pms k) xs,map (kpms2ms k) xs] where
    xs = [0..2^m-1]

mult_vs_pairing p1 p2 = (p1*p2) % (ppair bitpair (p1,p2))
mult_vs_mset_pairing p1 p2 = (p1*p2) % (ppair mset_pair (p1,p2))

q3 n = plotFunc3d
  [Title "Prime Multiplication vs. Prime Pairing"] []
  ps ps mult_vs_pairing where
    ps=genericTake n primes

q4 n = plotFunc3d
  [Title "Prime Multiplication vs. Prime Multiset Pairing"] []
  ps ps mult_vs_mset_pairing where
    ps=genericTake n primes

n4a n = plotFunc3d [Title "Multiplication"] []
  ps ps (*) where
    ps=[0..2^n-1]

n4b n = plotFunc3d [Title "Multiset Pairing"] []
  ps ps (curry mset_pair) where
    ps=[0..2^n-1]

n4c n = plotFunc3d [Title "mprod operation"] []
  ps ps (mprod) where
    ps=[0..2^n-1]

n4d n = plotFunc3d [Title "pmprod' operation"] []
  ps ps (pmprod') where
    ps=[0..2^n-1]

n4e n = plotFunc3d [Title "mprod' operation"] []
  ps ps (mprod') where
    ps=[0..2^n-1]

n4f n = plotFunc3d [Title "mprod' x y/ x * y"] []
  ps ps (\x y->(mprod' x y) % (x*y)) where
    ps=[1..2^n]

expMexp k m = plotLists []
  [map (\x->x^k) xs, map (\x->mexp' x k) xs] where
    xs = [0..2^m]



p4a n = plotFunc3d [Title "Prime Multiplication"] []
  ps ps (*) where
    ps=genericTake n primes

p4b n = plotFunc3d [Title "Prime Multiset Pairing"] []
  xs ys (curry mset_pair) where
    ps=genericTake n primes
    xs=ps
    ys=ps

p4c n = plotFunc3d [Title "mprod on primes"] []
  xs ys (mprod) where
    ps=genericTake n primes
    xs=ps
    ys=ps

p4d n = plotFunc3d [Title "pmprod on primes"] []
  xs ys (pmprod) where
    ps=genericTake n primes
    xs=ps
    ys=ps

p4f n = plotFunc3d [Title "mprod' x y/ x * y"] []
  ps ps (\x y->(mprod' x y) % (x*y)) where
    ps=genericTake n primes

q4c n = plotFunc3d [Title "Prime Pairing"] []
  ps ps (curry bitpair) where
    ps=genericTake n primes

q5 n = plotLists
  [Title "Prime Multiplication vs. Prime Pairing curves"]
  [prods,pairs] where
    us= map bitunpair [0..2^n-1]
    (xs,ys) = unzip us
    ps=primes
    xs'=map (from_pos_in ps) xs
    ys'=map (from_pos_in ps) ys
    prods = zipWith (*) xs' ys'
    us'=zip xs' ys'
    pairs= map (ppair mset_pair) us'

plot_gauss_op f m = plotFunc3d title [] zs zs (curry f) where
  title=[Title "Gauss Integer operations through Pairing Functions"]
  zs=[-2^m..2^m-1]

gsum m = plot_gauss_op gauss_sum m
gdif m = plot_gauss_op gauss_dif m
gprod m = plot_gauss_op gauss_prod m



